shreyask and Claude Opus 4.6 committed
Commit 6534024 · verified · 1 Parent(s): 1b6b223

fix: add eval-docs to root for HF static serving

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eval-docs/api-design-principles.md ADDED
@@ -0,0 +1,73 @@
# API Design Principles

## Introduction

Good API design is crucial for developer experience. This document outlines the core principles we follow when designing REST APIs.

## Principle 1: Use Nouns, Not Verbs

URLs should represent resources, not actions. Use HTTP methods to indicate the action.

**Good:**
- GET /users/123
- POST /orders
- DELETE /products/456

**Bad:**
- GET /getUser?id=123
- POST /createOrder
- GET /deleteProduct/456

## Principle 2: Use Plural Nouns

Always use plural nouns for consistency.

- /users (not /user)
- /orders (not /order)
- /products (not /product)

## Principle 3: Hierarchical Relationships

Express relationships through URL hierarchy.

- GET /users/123/orders - Get all orders for user 123
- GET /users/123/orders/456 - Get specific order 456 for user 123

## Principle 4: Filtering and Pagination

Use query parameters for filtering, sorting, and pagination.

- GET /products?category=electronics&sort=price&page=2&limit=20

+ ## Principle 5: Versioning
43
+
44
+ Always version your APIs. We prefer URL versioning.
45
+
46
+ - /v1/users
47
+ - /v2/users
48
+
49
+ ## Principle 6: Error Handling
50
+
51
+ Return consistent error responses with appropriate HTTP status codes.
52
+
53
+ ```json
54
+ {
55
+ "error": {
56
+ "code": "VALIDATION_ERROR",
57
+ "message": "Email format is invalid",
58
+ "field": "email"
59
+ }
60
+ }
61
+ ```
62
+
63
+ ## Principle 7: Rate Limiting
64
+
65
+ Implement rate limiting and communicate limits via headers:
66
+
67
+ - X-RateLimit-Limit: 1000
68
+ - X-RateLimit-Remaining: 999
69
+ - X-RateLimit-Reset: 1640000000
70
+
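One way these header values can be produced is a fixed-window counter. This is an illustrative sketch only; the class name, the hour-long window, and the in-memory dict are assumptions, not a prescribed implementation (production systems typically back this with Redis or similar):

```python
import time

class FixedWindowRateLimiter:
    """Illustrative fixed-window limiter that yields X-RateLimit-* header values."""

    def __init__(self, limit=1000, window_seconds=3600):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # (client_id, window_start) -> request count

    def check(self, client_id, now=None):
        now = int(now if now is not None else time.time())
        window_start = now - (now % self.window)  # align to window boundary
        key = (client_id, window_start)
        self.counts[key] = self.counts.get(key, 0) + 1
        remaining = max(0, self.limit - self.counts[key])
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(remaining),
            "X-RateLimit-Reset": str(window_start + self.window),
        }
        return self.counts[key] <= self.limit, headers

limiter = FixedWindowRateLimiter(limit=1000, window_seconds=3600)
allowed, headers = limiter.check("client-1", now=1640000000)
```
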
## Conclusion

Following these principles leads to APIs that are intuitive, consistent, and easy to maintain. Remember: the best API is one that developers can use without reading documentation.
eval-docs/distributed-systems-overview.md ADDED
@@ -0,0 +1,92 @@
# Distributed Systems: A Practical Overview

## What Makes a System "Distributed"?

A distributed system is a collection of independent computers that appears to its users as a single coherent system. The key challenges arise from:

1. **Partial failure** - Parts of the system can fail independently
2. **Unreliable networks** - Messages can be lost, delayed, or duplicated
3. **No global clock** - Different nodes have different views of time

## The CAP Theorem

Eric Brewer's CAP theorem states that a distributed system can provide only two of three guarantees:

- **Consistency**: All nodes see the same data at the same time
- **Availability**: Every request receives a response
- **Partition tolerance**: The system continues operating despite network partitions

In practice, network partitions happen, so you're really choosing between CP and AP systems.

### CP Systems (Consistency + Partition Tolerance)
- Examples: ZooKeeper, etcd, Consul
- Sacrifice availability during partitions
- Good for: coordination, leader election, configuration

### AP Systems (Availability + Partition Tolerance)
- Examples: Cassandra, DynamoDB, CouchDB
- Sacrifice consistency during partitions
- Good for: high-throughput, always-on services

## Consensus Algorithms

When nodes need to agree on something, they use consensus algorithms.

### Paxos
- Original consensus algorithm, by Leslie Lamport
- Notoriously difficult to understand and implement
- Foundation for many other algorithms

### Raft
- Designed to be understandable
- Used in etcd, Consul, CockroachDB
- Separates leader election from log replication

### PBFT (Practical Byzantine Fault Tolerance)
- Handles malicious nodes
- Used in blockchain systems
- Higher overhead than crash-fault-tolerant algorithms

## Replication Strategies

### Single-Leader Replication
- One node accepts writes
- Followers replicate from the leader
- Simple, but the leader is a bottleneck

### Multi-Leader Replication
- Multiple nodes accept writes
- Must handle write conflicts
- Good for multi-datacenter deployments

### Leaderless Replication
- Any node accepts writes
- Uses quorum reads/writes
- Examples: Dynamo-style databases

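The quorum rule behind leaderless replication can be stated as a one-line check. Using the usual Dynamo-style naming (N replicas, W write acknowledgments, R read responses), reads and writes are guaranteed to overlap on at least one up-to-date replica when R + W > N. This sketch checks only that condition; it does not implement replication itself:

```python
def quorum_overlaps(n, w, r):
    """True when a read quorum of r and a write quorum of w out of n replicas
    must share at least one replica, so a read sees the latest acked write."""
    return r + w > n

# Typical configuration: N=3, W=2, R=2 -> every read overlaps the last write.
assert quorum_overlaps(3, 2, 2)

# N=3, W=1, R=1 -> fast, but a read can miss the latest write entirely.
assert not quorum_overlaps(3, 1, 1)
```
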
## Consistency Models

From strongest to weakest:

1. **Linearizability** - Operations appear instantaneous
2. **Sequential consistency** - Operations appear in some sequential order
3. **Causal consistency** - Causally related operations appear in order
4. **Eventual consistency** - Given enough time, all replicas converge

## Partitioning (Sharding)

Distributing data across nodes:

### Hash Partitioning
- Hash the key to determine the partition
- Even distribution
- Range queries are inefficient

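A minimal sketch of hash partitioning follows. The key format and the partition count are made up for illustration, and the bare modulo is the simplest possible scheme; real systems usually prefer consistent hashing so that changing the partition count moves far less data:

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a string key to a partition index by hashing it.
    MD5 is used here only for its stable, well-spread output."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Keys spread across 4 partitions; the same key always lands in the same place.
assignments = {k: partition_for(k, 4) for k in ["user:1", "user:2", "order:99"]}
```
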
### Range Partitioning
- Ranges of keys on different nodes
- Good for range queries
- Risk of hot spots

## Conclusion

Building distributed systems requires understanding these fundamental concepts. Start simple, add complexity only when needed, and always plan for failure.
eval-docs/machine-learning-primer.md ADDED
@@ -0,0 +1,125 @@
# Machine Learning: A Beginner's Guide

## What is Machine Learning?

Machine learning is a subset of artificial intelligence in which systems learn patterns from data rather than being explicitly programmed. Instead of writing rules, you provide examples and let the algorithm discover the rules.

## Types of Machine Learning

### Supervised Learning

The algorithm learns from labeled examples.

**Classification**: Predicting categories
- Email spam detection
- Image recognition
- Medical diagnosis

**Regression**: Predicting continuous values
- House price prediction
- Stock price forecasting
- Temperature prediction

Common algorithms:
- Linear Regression
- Logistic Regression
- Decision Trees
- Random Forests
- Support Vector Machines (SVM)
- Neural Networks

### Unsupervised Learning

The algorithm finds patterns in unlabeled data.

**Clustering**: Grouping similar items
- Customer segmentation
- Document categorization
- Anomaly detection

**Dimensionality Reduction**: Simplifying data
- Feature extraction
- Visualization
- Noise reduction

Common algorithms:
- K-Means Clustering
- Hierarchical Clustering
- Principal Component Analysis (PCA)
- t-SNE

### Reinforcement Learning

The algorithm learns through trial and error, receiving rewards or penalties.

Applications:
- Game playing (AlphaGo, chess)
- Robotics
- Autonomous vehicles
- Resource management

## The Machine Learning Pipeline

1. **Data Collection**: Gather relevant data
2. **Data Cleaning**: Handle missing values and outliers
3. **Feature Engineering**: Create useful features
4. **Model Selection**: Choose an appropriate algorithm
5. **Training**: Fit the model to the training data
6. **Evaluation**: Test on held-out data
7. **Deployment**: Put the model into production
8. **Monitoring**: Track performance over time

## Key Concepts

### Overfitting vs Underfitting

**Overfitting**: The model memorizes the training data and performs poorly on new data
- Solutions: more data, regularization, a simpler model

**Underfitting**: The model is too simple to capture the patterns
- Solutions: more features, a more complex model, less regularization

### Train/Test Split

Never evaluate on training data. Common splits:
- 80% training, 20% testing
- 70% training, 15% validation, 15% testing

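The split above can be sketched in a few lines of plain Python. In practice a library routine such as scikit-learn's `train_test_split` is the usual choice; the 20% fraction and the fixed seed here are illustrative:

```python
import random

def train_test_split(data, test_fraction=0.2, seed=42):
    """Shuffle the examples, then hold out the last test_fraction for evaluation.
    A fixed seed makes the split reproducible across runs."""
    items = list(data)
    random.Random(seed).shuffle(items)
    split = int(len(items) * (1 - test_fraction))
    return items[:split], items[split:]

train, test = train_test_split(range(100), test_fraction=0.2)
# len(train) == 80, len(test) == 20
```

Shuffling before splitting matters: if the data arrives sorted (by date, by class), an unshuffled split gives the model a test set drawn from a different distribution than the training set.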
### Cross-Validation

K-fold cross-validation provides more robust evaluation:
1. Split data into K folds
2. Train on K-1 folds, test on remaining fold
3. Repeat K times
4. Average the results

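The four steps above can be sketched as an index generator (pure Python for illustration; libraries such as scikit-learn provide a ready-made `KFold`):

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of the k folds,
    spreading any remainder of n % k across the first folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# 10 examples, 5 folds: each fold tests on 2 examples and trains on the other 8.
folds = list(k_fold_indices(10, 5))
```

Averaging the metric over all K folds uses every example for testing exactly once, which is why the estimate is more robust than a single train/test split.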
### Bias-Variance Tradeoff

- **High Bias**: Oversimplified model (underfitting)
- **High Variance**: Overcomplicated model (overfitting)
- Goal: Find the sweet spot

## Evaluation Metrics

### Classification
- Accuracy: Correct predictions / Total predictions
- Precision: True positives / Predicted positives
- Recall: True positives / Actual positives
- F1 Score: Harmonic mean of precision and recall
- AUC-ROC: Area under the receiver operating characteristic curve

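These classification formulas translate directly into code. The following is an illustrative implementation for binary labels (AUC-ROC is omitted because it needs ranked scores rather than hard predictions):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1

# One true positive, one false positive, one false negative, one true negative:
acc, prec, rec, f1 = classification_metrics([1, 1, 0, 0], [1, 0, 1, 0])
# all four metrics come out to 0.5 on this toy input
```
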
### Regression
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R-squared (R²)

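The regression metrics can likewise be computed directly from paired predictions (illustrative toy values; R-squared is omitted for brevity):

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, and RMSE for paired true/predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, math.sqrt(mse)

# Errors of +1, 0, and -2 -> MAE = 1.0, MSE = 5/3, RMSE = sqrt(5/3)
mae, mse, rmse = regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
```

Note that MSE/RMSE penalize large errors quadratically, so a single bad prediction dominates them far more than it dominates MAE.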
## Getting Started

1. Learn Python and its core libraries (NumPy, Pandas, Scikit-learn)
2. Work through classic datasets (Iris, MNIST, Titanic)
3. Take online courses (Coursera, fast.ai)
4. Practice on Kaggle competitions
5. Build projects with real-world data

Remember: machine learning is 80% data preparation and 20% modeling. Start with clean data and simple models before going complex.