CropRL: Future Ideas & Enhancements
Ideas captured during implementation planning. These are not in scope for the MVP but the codebase is designed to accommodate them.
Task Design Ideas
Grader-Based Scoring (Post-MVP)
- Grader as separate script that accepts a trajectory JSON and returns 0.0–1.0
- Grader as reward function baked into the environment
- Need to decide which pattern fits OpenEnv hackathon expectations better
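To make the first pattern concrete, here is a minimal sketch of a standalone grader. The trajectory schema (a top-level `"steps"` list with one entry per survived step) and the function name `grade_trajectory` are assumptions for illustration; the real format is still undecided.

```python
import json

MAX_STEPS = 60  # episode length used by the task ideas in these notes


def grade_trajectory(trajectory_json: str, max_steps: int = MAX_STEPS) -> float:
    """Return a score in [0.0, 1.0] for one episode.

    Assumed (hypothetical) schema: {"steps": [{...}, {...}, ...]},
    with one entry per step the agent survived.
    """
    trajectory = json.loads(trajectory_json)
    steps_survived = len(trajectory.get("steps", []))
    score = steps_survived / max_steps
    # Clamp so malformed or over-long trajectories still yield a valid score.
    return max(0.0, min(1.0, score))
```

Keeping the grader as a pure function of the trajectory would let the same code run as a script or be called from inside the environment, which defers the choice between the two patterns.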
Task Variations (Beyond Easy/Medium/Hard Env Complexity)
- Survival Task: Score = 1.0 if survived all 60 steps, else steps_survived/60
- Profit Task: Score = clip(net_worth / target, 0, 1)
- Sustainability Task: Composite of profit (50%) + soil health maintenance (50%)
- Drought Challenge: High weather variance, agent must learn irrigation timing
- Debt Trap: Start with debt, agent must escape negative cash flow spiral
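The first three scoring rules above can be sketched as pure functions. The soil-health term is an assumption (final soil N relative to initial soil N, clipped to [0, 1]), since the notes do not define how "soil health maintenance" is measured, and `target` is assumed to be positive.

```python
def clip(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Clamp value into [low, high]."""
    return max(low, min(high, value))


def survival_score(steps_survived: int, total_steps: int = 60) -> float:
    """1.0 for a full episode, otherwise the fraction of steps survived."""
    if steps_survived >= total_steps:
        return 1.0
    return steps_survived / total_steps


def profit_score(net_worth: float, target: float) -> float:
    """clip(net_worth / target, 0, 1); assumes target > 0."""
    return clip(net_worth / target)


def sustainability_score(net_worth: float, target: float,
                         final_soil_n: float, initial_soil_n: float) -> float:
    """50% profit + 50% soil health.

    Assumption: soil health is final soil N divided by initial soil N.
    """
    soil_health = clip(final_soil_n / initial_soil_n)
    return 0.5 * profit_score(net_worth, target) + 0.5 * soil_health
```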
Task Difficulty Levers
| Lever | Easy | Medium | Hard |
|---|---|---|---|
| Interest rate | 0% | 8% | 12% |
| Initial cash | ₹15,000 | ₹10,000 | ₹7,000 |
| Initial soil N | 0.8 | 0.6 | 0.35 |
| Weather variance | σ=0.05 | σ=0.15 | σ=0.25 |
| Market volatility | σ=0.05 | σ=0.10 | σ=0.20 |
| Spoilage age | 12 months | 6 months | 3 months |
| Storage cost | ₹0/month | ₹0/month | ₹50/month |
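One way to carry these levers in code is a frozen dataclass with one preset per difficulty. The values are copied from the table; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DifficultyConfig:
    # Field names are illustrative, not taken from the codebase.
    interest_rate: float
    initial_cash: float           # rupees
    initial_soil_n: float
    weather_sigma: float
    market_sigma: float
    spoilage_age_months: int
    storage_cost_per_month: float  # rupees


PRESETS = {
    "easy": DifficultyConfig(0.00, 15_000, 0.80, 0.05, 0.05, 12, 0.0),
    "medium": DifficultyConfig(0.08, 10_000, 0.60, 0.15, 0.10, 6, 0.0),
    "hard": DifficultyConfig(0.12, 7_000, 0.35, 0.25, 0.20, 3, 50.0),
}
```

Freezing the dataclass keeps a preset from being mutated mid-episode; a custom task can still be derived with `dataclasses.replace(PRESETS["hard"], initial_cash=5_000)`.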
Market Price Enhancements
Random Walk Model ✅ IMPLEMENTED
P_t = P_{t-1} * (1 + drift + ε)
drift = reversion_speed * (target - P_{t-1}) / target
target = base_price * seasonal_multiplier
ε ~ N(0, σ²), clamped to ±3σ
- Prices now autocorrelate month-to-month with mean reversion toward seasonal targets
- Configurable via `enable_price_autocorrelation` and `price_reversion_speed`
- All prices clamped to `[1.0, base × price_max_multiplier]`
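For reference, one step of the update above can be sketched as follows. This is an illustrative reading of the formulas, not the code in the repository; the function name and argument order are assumptions.

```python
import random


def next_price(prev_price: float, base_price: float, seasonal_multiplier: float,
               sigma: float, reversion_speed: float,
               price_max_multiplier: float, rng: random.Random) -> float:
    """One month of the mean-reverting random walk."""
    target = base_price * seasonal_multiplier
    drift = reversion_speed * (target - prev_price) / target
    # Gaussian noise, clamped to +/- 3 sigma.
    eps = rng.gauss(0.0, sigma)
    eps = max(-3.0 * sigma, min(3.0 * sigma, eps))
    price = prev_price * (1.0 + drift + eps)
    # Final clamp to [1.0, base * price_max_multiplier].
    return max(1.0, min(base_price * price_max_multiplier, price))
```

With `sigma = 0` the noise vanishes, so a price of 500 against a target of 1000 and a reversion speed of 0.2 moves to 500 × (1 + 0.1) = 550.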
Demand Shocks ✅ IMPLEMENTED
- With probability `demand_shock_probability` (~8% per month ≈ 1/year), one random crop gets a price spike or dip of 30-60%
- Direction is random (positive or negative)
- Creates opportunities for agents that maintain storage/optionality
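A minimal sketch of the shock described above, assuming prices are held in a dict keyed by crop name and that the magnitude is drawn uniformly from 30-60% (the distribution is not stated in these notes). The function name and return shape are hypothetical.

```python
import random


def apply_demand_shock(prices: dict, rng: random.Random,
                       shock_probability: float = 0.08,
                       min_magnitude: float = 0.30,
                       max_magnitude: float = 0.60):
    """Return (new_prices, shocked_crop); shocked_crop is None if no shock fired."""
    if rng.random() >= shock_probability:
        return dict(prices), None
    crop = rng.choice(sorted(prices))      # one random crop
    magnitude = rng.uniform(min_magnitude, max_magnitude)
    direction = rng.choice((1, -1))        # spike or dip
    shocked = dict(prices)
    shocked[crop] = prices[crop] * (1 + direction * magnitude)
    return shocked, crop
```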
Correlated Crop Prices
- When Corn price is high, Wheat tends to be mid, Chickpea low (substitute goods)
- Could model with a correlation matrix on the noise terms
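The correlation-matrix idea can be sketched with a Cholesky factor applied to independent Gaussian draws. The matrix values below are placeholders chosen only to be positive definite (Corn and Chickpea negatively correlated); they are not calibrated to anything.

```python
import math
import random

# Hypothetical correlations in the order (corn, wheat, chickpea).
CORRELATION = [
    [1.0, 0.2, -0.5],
    [0.2, 1.0, 0.1],
    [-0.5, 0.1, 1.0],
]


def cholesky(matrix):
    """Lower-triangular L with L @ L.T == matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def correlated_noise(lower, sigmas, rng):
    """Draw one noise term per crop with the correlation encoded in `lower`."""
    z = [rng.gauss(0.0, 1.0) for _ in sigmas]
    return [
        sigmas[i] * sum(lower[i][k] * z[k] for k in range(i + 1))
        for i in range(len(sigmas))
    ]
```

Each returned value could stand in for the independent ε in a crop's price update, so the per-crop clamping stays unchanged.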
Environment Extensions (Future Scope from RoughPlan)
Machinery System
- `Machinery_Health`: 0.0–1.0, degrades with each physical action (Plant/Harvest/Irrigate)
- `Repair_Machinery` action (ID 11): costs `cost_machinery_repair`, restores health
- If machinery_health < 0.2: yield penalty (equipment breaking down)
- If machinery_health = 0: cannot perform physical actions
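The rules above could be captured in a small state object. The wear per action, the size of the yield penalty, and the assumption that a repair restores health fully are all placeholders; only the 0.2 and 0 thresholds come from the notes.

```python
from dataclasses import dataclass


@dataclass
class Machinery:
    health: float = 1.0
    wear_per_action: float = 0.02              # hypothetical degradation rate
    low_health_threshold: float = 0.2          # from the notes above
    low_health_yield_multiplier: float = 0.8   # hypothetical penalty size

    def can_operate(self) -> bool:
        """Physical actions are blocked once health reaches zero."""
        return self.health > 0.0

    def use(self) -> bool:
        """Apply wear for one Plant/Harvest/Irrigate action; False if blocked."""
        if not self.can_operate():
            return False
        self.health = max(0.0, self.health - self.wear_per_action)
        return True

    def yield_multiplier(self) -> float:
        """Yield penalty while health is below the threshold."""
        if self.health < self.low_health_threshold:
            return self.low_health_yield_multiplier
        return 1.0

    def repair(self, cash: float, cost_machinery_repair: float) -> float:
        """Restore health (assumed: to full) and return the remaining cash."""
        self.health = 1.0
        return cash - cost_machinery_repair
```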
Storage Costs
- Monthly cost proportional to stored amount
- Creates pressure to sell rather than hoard indefinitely
Dynamic Costs
- Seed/irrigation/fertilizer costs vary by month (supply chain effects)
- Could follow similar seasonal model as market prices
Action Masking
- Provide `valid_actions: list[int]` in observation
- Prevents agent from wasting steps on invalid actions
- Important for RL training efficiency (mask logits)
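On the policy side, masking logits usually means setting invalid entries to negative infinity before the softmax, so they receive exactly zero probability. A framework-free sketch (the function name is illustrative):

```python
import math


def masked_softmax(logits: list, valid_actions: list) -> list:
    """Softmax over logits with every action outside valid_actions forced to 0."""
    valid = set(valid_actions)
    if not valid:
        raise ValueError("at least one action must be valid")
    masked = [x if i in valid else -math.inf for i, x in enumerate(logits)]
    peak = max(masked)                           # subtract max for numerical stability
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `masked_softmax([0.0, 0.0, 0.0, 0.0], [1, 3])` gives `[0.0, 0.5, 0.0, 0.5]`.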
Multi-Field Farming
- Multiple land plots, each with independent crop/soil state
- Dramatically increases state/action space complexity
Observation Formats
Text-Friendly (for LLM agents)
Month: June (6) | Step: 12/60
Weather: Expected rainfall 0.72 (monsoon)
Farm: Growing Corn (age 3/4 months) | Soil N: 0.45
Finance: Cash ₹8,200 | Debt ₹5,000 | Interest: 11%
Markets: Corn ₹1,150/ton | Wheat ₹820/ton | Chickpea ₹490/ton
Storage: 5.2 tons of Wheat (age 2 months)
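A formatter that produces this layout might look like the sketch below. The observation is assumed to be a plain dict, and every key name is hypothetical.

```python
import calendar


def render_observation(obs: dict) -> str:
    """Render a dict observation as the text layout shown above."""
    markets = " | ".join(
        f"{crop} ₹{price:,.0f}/ton" for crop, price in obs["prices"].items()
    )
    lines = [
        f"Month: {calendar.month_name[obs['month']]} ({obs['month']})"
        f" | Step: {obs['step']}/{obs['max_steps']}",
        f"Weather: Expected rainfall {obs['rainfall']:.2f} ({obs['season']})",
        f"Farm: Growing {obs['crop']} (age {obs['crop_age']}/{obs['crop_maturity']} months)"
        f" | Soil N: {obs['soil_n']:.2f}",
        f"Finance: Cash ₹{obs['cash']:,.0f} | Debt ₹{obs['debt']:,.0f}"
        f" | Interest: {obs['interest_rate']:.0%}",
        f"Markets: {markets}",
        f"Storage: {obs['stored_tons']:.1f} tons of {obs['stored_crop']}"
        f" (age {obs['stored_age']} months)",
    ]
    return "\n".join(lines)
```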
Numeric Vector (for RL policies)
[month_sin, month_cos, rainfall, crop_type_onehot(4), crop_age/6,
yield_potential, soil_n, cash/50000, debt/20000, interest_rate,
price_1/2000, price_2/2000, price_3/1000,
stored_type_onehot(4), stored_amount/10, stored_age/6]
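The vector layout above can be built as follows. The sketch assumes the four one-hot slots are "none" plus the three crops, that `month` runs 1-12, and that the observation is a dict with the hypothetical keys shown; the result has 22 entries.

```python
import math

# Assumed category order for both one-hot blocks.
CROP_TYPES = ("none", "corn", "wheat", "chickpea")


def one_hot(value, categories):
    return [1.0 if value == c else 0.0 for c in categories]


def encode_observation(obs: dict) -> list:
    """Flatten a dict observation into the 22-element vector described above."""
    angle = 2.0 * math.pi * obs["month"] / 12.0  # cyclic month encoding
    return [
        math.sin(angle), math.cos(angle),
        obs["rainfall"],
        *one_hot(obs["crop"], CROP_TYPES),
        obs["crop_age"] / 6,
        obs["yield_potential"],
        obs["soil_n"],
        obs["cash"] / 50000,
        obs["debt"] / 20000,
        obs["interest_rate"],
        obs["prices"][0] / 2000,
        obs["prices"][1] / 2000,
        obs["prices"][2] / 1000,
        *one_hot(obs["stored_crop"], CROP_TYPES),
        obs["stored_amount"] / 10,
        obs["stored_age"] / 6,
    ]
```

The sine/cosine pair keeps December and January adjacent, which a raw month index would not.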
Additional Stochastic Features (Future Scope)
Pest / Disease Events
- Each month, a growing crop has a small probability (~5%) of a pest event
- Damage: lose 20-80% of yield potential (uniform random)
- Creates "insurance" calculus: harvest early to lock in value vs gamble on maturity
- Could be modeled as a `crop_health_multiplier` in state
- Irrigated/fertilized crops could have lower pest probability
- Implementation: one `rng.random()` check per month + one float in state
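A sketch of the monthly check, using the ~5% probability and the 20-80% uniform damage range from the bullets above. The function name and the tuple return value are assumptions.

```python
import random


def apply_pest_event(crop_health_multiplier: float, rng: random.Random,
                     pest_probability: float = 0.05,
                     min_damage: float = 0.2, max_damage: float = 0.8):
    """Return (new_multiplier, event_fired) after one month's pest check."""
    if rng.random() < pest_probability:
        damage = rng.uniform(min_damage, max_damage)  # fraction of yield potential lost
        return crop_health_multiplier * (1.0 - damage), True
    return crop_health_multiplier, False
```

A lower pest probability for irrigated or fertilized crops could be expressed by passing a reduced `pest_probability` for that month.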
Stochastic Spoilage
- Instead of deterministic rot at max_storage_age, spoilage probability increases with age: `spoilage_prob = 0 if age <= safe_age else (age - safe_age) / (max_age - safe_age)`
- Removes the "safe to store for exactly N months" certainty
- Makes storage timing a genuine risk-reward tradeoff
- Implementation: change 3 lines in `apply_spoilage()`, zero new state
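The probability ramp above can be sketched as below. The signature of the existing `apply_spoilage()` is not shown in these notes, so the second function uses a different, hypothetical name; it also assumes that a spoilage event destroys the whole stored lot and caps the probability at 1.0 past `max_age`.

```python
import random


def spoilage_probability(age: float, safe_age: float, max_age: float) -> float:
    """0 up to safe_age, rising linearly to 1 at max_age."""
    if age <= safe_age:
        return 0.0
    if age >= max_age:
        return 1.0  # cap added here; the formula alone would exceed 1 past max_age
    return (age - safe_age) / (max_age - safe_age)


def apply_stochastic_spoilage(stored_amount: float, age: float,
                              safe_age: float, max_age: float,
                              rng: random.Random) -> float:
    """Return the stored amount after one monthly spoilage roll."""
    if rng.random() < spoilage_probability(age, safe_age, max_age):
        return 0.0  # assumption: the entire lot spoils
    return stored_amount
```

With `safe_age = 3` and `max_age = 6`, a lot aged 4.5 months spoils with probability 0.5.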