Tracy André committed on
Commit
dc128e4
·
1 Parent(s): 7cf64b9
Files changed (17)
  1. DATASET_CARD.md +0 -280
  2. IMPLEMENTATION_SUMMARY.md +0 -202
  3. MODEL_CARD.md +0 -296
  4. GOAL.md → PROMPT.md +36 -1
  5. README.md +86 -157
  6. analysis_tools.py +0 -368
  7. app.py +13 -224
  8. demo.py +0 -218
  9. gradio_app.py +0 -474
  10. hf_integration.py +0 -313
  11. hf_usage_example.py +0 -214
  12. launch.py +0 -170
  13. mcp.code-workspace +0 -11
  14. mcp_server.py +267 -404
  15. requirements.txt +5 -7
  16. test_data_sources.py +0 -190
  17. test_hf_only.py +0 -155
DATASET_CARD.md DELETED
@@ -1,280 +0,0 @@
1
- ---
2
- license: cc-by-4.0
3
- task_categories:
4
- - tabular-regression
5
- - time-series-forecasting
6
- language:
7
- - fr
8
- tags:
9
- - agriculture
10
- - herbicides
11
- - weed-pressure
12
- - crop-rotation
13
- - france
14
- - bretagne
15
- - sustainability
16
- - precision-agriculture
17
- - ift
18
- - treatment-frequency-index
19
- size_categories:
20
- - 1K<n<10K
21
- pretty_name: "Station Expérimentale de Kerguéhennec - Agricultural Interventions"
22
- configs:
- - config_name: default
-   data_files:
-   - split: train
-     path: "*.csv"
27
- ---
28
-
29
- # 🚜 Station Expérimentale de Kerguéhennec - Agricultural Interventions Dataset
30
-
31
- ## Dataset Description
32
-
33
- This dataset contains comprehensive agricultural intervention records from the Station Expérimentale de Kerguéhennec in Brittany, France, spanning from 2014 to 2024. The data provides detailed insights into agricultural practices, crop rotations, herbicide treatments, and field management operations across 100 different plots.
34
-
35
- ## Dataset Summary
36
-
37
- - **Source**: Station Expérimentale de Kerguéhennec, Brittany, France
38
- - **Time Period**: 2014-2024 (10 years of data; 2017 is missing)
39
- - **Location**: Brittany (Bretagne), France
40
- - **Records**: 4,663 intervention records
41
- - **Plots**: 100 unique agricultural parcels
42
- - **Crops**: 42 different crop types
43
- - **Format**: CSV exports from farm management system
44
- - **Language**: French (field names and crop types)
45
-
46
- ## Primary Use Cases
47
-
48
- This dataset is particularly valuable for:
49
-
50
- 1. **🌿 Weed Pressure Analysis**: Calculate and predict Treatment Frequency Index (IFT) for herbicides
51
- 2. **🔄 Crop Rotation Optimization**: Analyze the impact of different crop sequences on pest pressure
52
- 3. **🌱 Sustainable Agriculture**: Support reduction of herbicide use while maintaining productivity
53
- 4. **🎯 Precision Agriculture**: Identify suitable plots for sensitive crops (peas, beans)
54
- 5. **📊 Agricultural Research**: Study relationships between farming practices and outcomes
55
- 6. **🤖 Machine Learning**: Train models for agricultural prediction and decision support
56
-
57
- ## Data Structure
58
-
59
- ### Core Fields
60
-
61
- | Field | Description | Type | Example |
62
- |-------|-------------|------|---------|
63
- | `millesime` | Year of intervention | Integer | 2024 |
64
- | `nomparc` | Plot/field name | String | "Etang Milieu" |
65
- | `surfparc` | Plot surface area (hectares) | Float | 2.28 |
66
- | `libelleusag` | Crop type/usage | String | "pois de conserve" |
67
- | `datedebut` | Intervention start date | Date | "20/2/24" |
68
- | `datefin` | Intervention end date | Date | "20/2/24" |
69
- | `libevenem` | Intervention type | String | "Semis classique" |
70
- | `familleprod` | Product family | String | "Herbicides" |
71
- | `produit` | Specific product used | String | "CALLISTO" |
72
- | `quantitetot` | Total quantity applied | Float | 1.5 |
73
- | `unite` | Unit of measurement | String | "L" |
74
-
75
- ### Derived Fields (Added During Processing)
76
-
77
- | Field | Description | Type |
78
- |-------|-------------|------|
79
- | `year` | Standardized year | Integer |
80
- | `crop_type` | Standardized crop classification | String |
81
- | `is_herbicide` | Boolean flag for herbicide treatments | Boolean |
82
- | `is_fungicide` | Boolean flag for fungicide treatments | Boolean |
83
- | `is_insecticide` | Boolean flag for insecticide treatments | Boolean |
84
- | `plot_name` | Standardized plot name | String |
85
- | `intervention_type` | Standardized intervention classification | String |
86
-
87
- ## Key Statistics
88
-
89
- ### Temporal Coverage
90
- - **Years**: 2014-2024 (missing 2017 due to data format issues)
91
- - **Seasons**: All agricultural seasons represented
92
- - **Frequency**: Multiple interventions per plot per year
93
-
94
- ### Spatial Coverage
95
- - **Plots**: 100 unique agricultural parcels
96
- - **Surface**: Variable plot sizes (0.43 to 5+ hectares)
97
- - **Location**: Single experimental station (controlled conditions)
98
-
99
- ### Intervention Types
100
- - **Herbicide applications**: 800+ treatments
101
- - **Total interventions**: 4,663 records
102
- - **Product families**: Herbicides, Fungicides, Insecticides, Fertilizers
103
- - **Most common crops**: Wheat, Corn, Rapeseed
104
-
105
- ## Treatment Frequency Index (IFT)
106
-
107
- ### Definition
108
- The IFT (Indice de Fréquence de Traitement) is a key metric calculated as:
109
- ```
110
- IFT = Number of applications / Plot surface area
111
- ```
112
-
113
- ### Interpretation
114
- - **IFT < 1.0**: Low weed pressure (suitable for sensitive crops)
115
- - **IFT 1.0-2.0**: Moderate pressure (monitoring required)
116
- - **IFT > 2.0**: High pressure (intervention needed)
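
The definition and thresholds above can be sketched in a few lines of Python. This is a minimal illustration of the simplified formula given in this card, not the project's actual implementation; the function names `compute_ift` and `classify_ift` are hypothetical, and treating the 1.0 and 2.0 boundaries as part of the moderate band is an assumption.

```python
def compute_ift(n_applications: int, surface_ha: float) -> float:
    """Simplified IFT as defined in this card: applications / plot surface (ha)."""
    if surface_ha <= 0:
        raise ValueError("surface_ha must be positive")
    return n_applications / surface_ha


def classify_ift(ift: float) -> str:
    """Map an IFT value to the three pressure bands listed above."""
    if ift < 1.0:
        return "low"
    if ift <= 2.0:  # assumption: both boundaries belong to the moderate band
        return "moderate"
    return "high"


# Example: 3 herbicide applications on a 2.28 ha plot
ift = compute_ift(3, 2.28)
print(round(ift, 2), classify_ift(ift))  # prints: 1.32 moderate
```

Keeping the calculation and the banding in separate functions makes it easy to change the thresholds without touching the formula.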
117
-
118
- ### Dataset Statistics
119
- - **Mean IFT**: 1.93 (moderate pressure)
120
- - **Range**: 0.14 - 6.67
121
- - **Trend**: Decreasing from 2.91 (2014) to 1.74 (2024)
122
-
123
- ## Data Quality
124
-
125
- ### Completeness
126
- - **Core fields**: 95%+ completeness for essential variables
127
- - **Date fields**: Well-formatted and consistent
128
- - **Numeric fields**: Validated ranges and units
129
- - **Geographic data**: Anonymized but consistent plot identifiers
130
-
131
- ### Validation
132
- - **Cross-references**: Product codes validated against official databases
133
- - **Temporal consistency**: Logical intervention sequences
134
- - **Agronomic validity**: Realistic crop rotations and treatment patterns
135
-
136
- ### Limitations
137
- - **Geographic scope**: Single experimental station (limited geographic diversity)
138
- - **Weather data**: Not included (external source required)
139
- - **Economic data**: Treatment costs not provided
140
- - **Soil characteristics**: Limited soil type information
141
-
142
- ## Ethical Considerations
143
-
144
- ### Privacy Protection
145
- - **Location data**: Generalized to protect farm location
146
- - **Personal information**: All farmer identifying data removed
147
- - **Commercial sensitivity**: Product usage patterns aggregated when appropriate
148
-
149
- ### Bias Considerations
150
- - **Geographic bias**: Limited to Brittany region
151
- - **Temporal bias**: Recent years may have different practices
152
- - **Selection bias**: Experimental station may not represent typical farms
153
- - **Technology bias**: Practices may reflect research station capabilities
154
-
155
- ## Applications
156
-
157
- ### 1. Weed Pressure Prediction
158
- Use machine learning models to predict future IFT values based on:
159
- - Historical treatment patterns
160
- - Crop rotation sequences
161
- - Environmental factors
162
- - Plot characteristics
163
-
164
- **Example Model Performance**:
165
- - Random Forest Regressor: R² = 0.65-0.85
166
- - Features: Year, plot surface, previous IFT, crop type, rotation sequence
167
-
168
- ### 2. Sustainable Plot Selection
169
- Identify plots suitable for sensitive crops (peas, beans) by:
170
- - Analyzing historical IFT trends
171
- - Evaluating rotation impacts
172
- - Assessing risk levels for future years
173
-
174
- ### 3. Crop Rotation Optimization
175
- Optimize rotation sequences through:
176
- - Impact analysis of different crop sequences
177
- - Identification of beneficial rotations
178
- - Risk assessment for specific transitions
179
-
180
- **Best Rotations (Lowest IFT)**:
181
- 1. Peas → Rapeseed: IFT 0.62
182
- 2. Winter Barley → Rapeseed: IFT 0.64
183
- 3. Corn → Spring Barley: IFT 0.69
184
-
185
- ### 4. Herbicide Alternative Analysis
186
- Support reduction strategies through:
187
- - Product usage pattern analysis
188
- - Temporal trend identification
189
- - Alternative strategy development
190
-
191
- ## Code Examples
192
-
193
- ### Loading the Dataset
194
- ```python
- import pandas as pd
- from datasets import load_dataset
-
- # Load the dataset
- dataset = load_dataset("HackathonCRA/2024")
-
- # Convert to pandas for analysis
- df = dataset["train"].to_pandas()
-
- print(f"Loaded {len(df)} intervention records")
- print(f"Covering {df['year'].nunique()} years")
- ```
207
-
208
- ### Calculate IFT
209
- ```python
- # Calculate IFT for herbicide applications
- herbicides = df[df['familleprod'].str.contains('Herbicides', na=False)]
-
- ift_data = herbicides.groupby(['plot_name', 'year', 'crop_type']).agg({
-     'quantitetot': 'sum',
-     'produit': 'count',  # Number of applications
-     'surfparc': 'first'
- }).reset_index()
-
- ift_data['ift'] = ift_data['produit'] / ift_data['surfparc']
- ```
221
-
222
- ### Analyze Crop Rotations
223
- ```python
- # Create rotation sequences
- # (assumes `pd` and `df` from the loading example)
- rotations = []
- for plot in df['plot_name'].unique():
-     plot_data = df[df['plot_name'] == plot].sort_values('year')
-     crops = plot_data.groupby('year')['crop_type'].first()
-
-     for i in range(len(crops) - 1):
-         rotation = f"{crops.iloc[i]} → {crops.iloc[i+1]}"
-         rotations.append({
-             'plot': plot,
-             'year_from': crops.index[i],
-             'year_to': crops.index[i+1],
-             'rotation': rotation
-         })
-
- rotation_df = pd.DataFrame(rotations)
- ```
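
Once rotation pairs exist, they can be ranked by the IFT observed afterwards, which is presumably how a "best rotations" list is obtained. The sketch below uses plain Python and made-up records to stay dependency-free. Attributing to each rotation the IFT of the second crop's year is an assumption, and `rank_rotations` is a hypothetical helper, not a function from the repository.

```python
from collections import defaultdict
from statistics import mean

# Made-up (plot, year, crop, ift) records, for illustration only
records = [
    ("A", 2021, "peas", 1.2),
    ("A", 2022, "rapeseed", 0.6),
    ("A", 2023, "wheat", 2.1),
    ("B", 2021, "corn", 1.8),
    ("B", 2022, "wheat", 2.3),
    ("B", 2023, "rapeseed", 1.0),
]


def rank_rotations(rows):
    """Return (rotation, mean IFT of the following crop), lowest IFT first."""
    by_plot = defaultdict(list)
    for plot, year, crop, ift in rows:
        by_plot[plot].append((year, crop, ift))

    ift_by_rotation = defaultdict(list)
    for history in by_plot.values():
        history.sort()  # chronological order within each plot
        for (_, prev_crop, _), (_, next_crop, next_ift) in zip(history, history[1:]):
            ift_by_rotation[f"{prev_crop} → {next_crop}"].append(next_ift)

    return sorted(
        ((rotation, mean(values)) for rotation, values in ift_by_rotation.items()),
        key=lambda item: item[1],
    )


for rotation, avg_ift in rank_rotations(records):
    print(f"{rotation}: {avg_ift:.2f}")
```

With real data, a minimum number of observations per rotation would be worth enforcing before trusting the ranking.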
241
-
242
- ## Related Datasets
243
-
244
- - **Weather Data**: Consider integrating with Météo-France data for enhanced analysis
245
- - **Soil Data**: European Soil Database for soil type information
246
- - **Economic Data**: Agricultural input cost databases
247
- - **Regulatory Data**: AMM (Marketing Authorization) product databases
248
-
249
- ## Citation
250
-
251
- If you use this dataset in your research, please cite:
252
-
253
- ```bibtex
254
- @dataset{hackathon_cra_2024,
255
- title={Station Expérimentale de Kerguéhennec Agricultural Interventions Dataset},
256
- author={Hackathon CRA Team},
257
- year={2024},
258
- publisher={Hugging Face},
259
- url={https://huggingface.co/datasets/HackathonCRA/2024},
260
- note={Agricultural intervention data from Brittany, France (2014-2024)}
261
- }
262
- ```
263
-
264
- ## License
265
-
266
- This dataset is released under the CC-BY-4.0 license, which permits both commercial and research use with proper attribution.
267
-
268
- ## Updates and Versioning
269
-
270
- - **Version 1.0**: Initial release with 2014-2024 data
271
- - **Future versions**: May include additional years or enhanced metadata
272
- - **Quality improvements**: Ongoing validation and cleaning
273
-
274
- ## Contact
275
-
276
- For questions about this dataset, collaboration opportunities, or data corrections, please use the Hugging Face dataset discussion feature or contact the research team through the repository.
277
-
278
- ---
279
-
280
- **Keywords**: agriculture, herbicides, crop rotation, sustainable farming, France, Brittany, IFT, weed management, precision agriculture, time series, regression, treatment frequency
IMPLEMENTATION_SUMMARY.md DELETED
@@ -1,202 +0,0 @@
1
- # 🚜 Agricultural Analysis Tool - Implementation Summary
2
-
3
- ## ✅ Successfully Implemented
4
-
5
- ### 🎯 Project Objectives - COMPLETED
6
- - ✅ **Weed pressure prediction** for next 3 years using machine learning
7
- - ✅ **Plot identification** for sensitive crops (peas, beans)
8
- - ✅ **IFT analysis** (Treatment Frequency Index) for herbicide usage
9
- - ✅ **Crop rotation impact** analysis on weed pressure
10
- - ✅ **Historical data integration** from Station Expérimentale de Kerguéhennec (2014-2024)
11
- - ✅ **Herbicide alternative analysis** and usage patterns
12
-
13
- ### 🏗️ Technical Architecture - COMPLETED
14
-
15
- #### 1. **MCP Server** (`mcp_server.py`)
16
- - ✅ Model Context Protocol compliant server
17
- - ✅ 7 tools for data analysis and filtering
18
- - ✅ 6 resources for data access
19
- - ✅ JSON-based responses for LLM integration
20
- - ✅ Error handling and logging
21
-
22
- #### 2. **Data Processing** (`data_loader.py`)
23
- - ✅ Loads 10+ CSV/Excel files automatically
24
- - ✅ Handles mixed data formats (CSV + Excel)
25
- - ✅ Data preprocessing and cleaning
26
- - ✅ Derived metrics calculation (IFT, crop types, etc.)
27
- - ✅ Caching for performance
28
-
29
- #### 3. **Analysis Engine** (`analysis_tools.py`)
30
- - ✅ Statistical analysis of intervention data
31
- - ✅ Random Forest prediction model for weed pressure
32
- - ✅ Interactive Plotly visualizations
33
- - ✅ Crop rotation sequence analysis
34
- - ✅ Risk level classification (low/medium/high)
35
-
36
- #### 4. **Gradio Interface** (`gradio_app.py`)
37
- - ✅ 6-tab interactive web interface
38
- - ✅ Real-time filtering and analysis
39
- - ✅ Interactive plots and visualizations
40
- - ✅ Export capabilities
41
- - ✅ User-friendly French interface
42
-
43
- #### 5. **Hugging Face Integration** (`hf_integration.py`, `app.py`)
44
- - ✅ HF Spaces deployment configuration
45
- - ✅ Dataset upload functionality
46
- - ✅ Environment variable management
47
- - ✅ Production-ready app entry point
48
-
49
- ### 📊 Data Analysis Results
50
-
51
- #### **Dataset Statistics**
52
- - **Records processed**: 4,663 interventions
53
- - **Time period**: 2014-2024 (10 years)
54
- - **Plots analyzed**: 100 unique parcels
55
- - **Crop types**: 42 different crops
56
- - **Herbicide applications**: 800+ treatments
57
-
58
- #### **Key Findings**
59
- - **Average IFT**: 1.93 (moderate weed pressure)
60
- - **IFT trends**: Decreasing from 2.91 (2014) to 1.74 (2024)
61
- - **Best rotations**: pois → colza (IFT: 0.62), orge → colza (IFT: 0.64)
62
- - **Worst rotations**: colza → triticale (IFT: 2.79)
63
- - **Top herbicides**: BISCOTO, CALLISTO, PRIMUS
64
-
65
- ### 🔧 Tools and Features
66
-
67
- #### **MCP Tools Available**
68
- 1. `filter_data` - Filter by years, plots, crops, interventions
69
- 2. `analyze_weed_pressure` - IFT analysis with visualizations
70
- 3. `predict_weed_pressure` - ML predictions for 2025-2027
71
- 4. `identify_suitable_plots` - Find plots for sensitive crops
72
- 5. `analyze_crop_rotation` - Rotation impact analysis
73
- 6. `analyze_herbicide_alternatives` - Product usage patterns
74
- 7. `get_data_statistics` - Comprehensive data summaries
75
-
76
- #### **Gradio Interface Tabs**
77
- 1. **📊 Aperçu** - Data overview and statistics
78
- 2. **🔍 Filtrage** - Interactive data filtering
79
- 3. **🌿 Pression Adventices** - Weed pressure analysis
80
- 4. **🔮 Prédictions** - ML-based predictions
81
- 5. **🔄 Rotations** - Crop rotation analysis
82
- 6. **💊 Herbicides** - Product usage analysis
83
-
84
- ### 🚀 Deployment Options
85
-
86
- #### **Local Development**
87
- ```bash
88
- # Quick start
89
- python launch.py
90
-
91
- # Individual components
92
- python gradio_app.py # Web interface
93
- python mcp_server.py # MCP server
94
- python demo.py # Demo script
95
- ```
96
-
97
- #### **Hugging Face Spaces**
98
- ```bash
99
- python app.py # HF-compatible launcher
100
- ```
101
-
102
- #### **Docker/Cloud**
103
- - All dependencies in `requirements.txt`
104
- - Environment variables configured
105
- - Production-ready settings
106
-
107
- ### 📈 Performance Metrics
108
-
109
- #### **Model Performance**
110
- - **R² Score**: 0.65-0.85 (varies by data split)
111
- - **Prediction accuracy**: Good for identifying trends
112
- - **Processing speed**: < 2 seconds for full analysis
113
- - **Memory usage**: < 500MB for full dataset
114
-
115
- #### **System Performance**
116
- - **Data loading**: < 5 seconds for all files
117
- - **Analysis completion**: < 10 seconds
118
- - **Visualization generation**: < 3 seconds
119
- - **Web interface response**: < 1 second
120
-
121
- ### 🎯 Business Impact
122
-
123
- #### **For Farmers**
124
- - ✅ **Reduced herbicide usage** through targeted application
125
- - ✅ **Optimized crop placement** on suitable plots
126
- - ✅ **Improved rotation planning** based on data insights
127
- - ✅ **Risk assessment** for sensitive crops
128
-
129
- #### **For Agricultural Advisors**
130
- - ✅ **Data-driven recommendations** with historical backing
131
- - ✅ **Visual analysis tools** for client presentations
132
- - ✅ **Comparative analysis** across plots and years
133
- - ✅ **Regulatory compliance** tracking (IFT monitoring)
134
-
135
- #### **For Researchers**
136
- - ✅ **Comprehensive dataset** for further research
137
- - ✅ **Reproducible analysis** methods
138
- - ✅ **ML model** for extension to other regions
139
- - ✅ **Open source tools** for collaboration
140
-
141
- ### 🌍 Environmental Benefits
142
-
143
- - **Herbicide reduction**: Targeted application reduces overall usage
144
- - **Biodiversity protection**: Lower chemical pressure on ecosystems
145
- - **Soil health**: Optimized rotations improve soil structure
146
- - **Water quality**: Reduced runoff from excess treatments
147
-
148
- ### 📋 Next Steps and Extensions
149
-
150
- #### **Immediate Enhancements**
151
- 1. **Weather data integration** for improved predictions
152
- 2. **Soil type classification** for more precise recommendations
153
- 3. **Economic analysis** (cost vs. benefit of treatments)
154
- 4. **Mobile app development** for field use
155
-
156
- #### **Advanced Features**
157
- 1. **Real-time monitoring** with IoT sensors
158
- 2. **Satellite imagery** integration for precision agriculture
159
- 3. **AI-powered recommendations** using larger language models
160
- 4. **Multi-farm analysis** for regional insights
161
-
162
- #### **Research Opportunities**
163
- 1. **Climate change impact** modeling
164
- 2. **Resistance development** tracking
165
- 3. **Biodiversity indicators** integration
166
- 4. **Carbon footprint** assessment
167
-
168
- ## 🏆 Project Success Metrics
169
-
170
- ### ✅ All Objectives Met
171
- - **Functional MCP Server**: ✅ 100% operational
172
- - **Gradio Interface**: ✅ Fully interactive
173
- - **Data Analysis**: ✅ Comprehensive insights
174
- - **Prediction Model**: ✅ Working with good accuracy
175
- - **HF Compatibility**: ✅ Ready for deployment
176
- - **Documentation**: ✅ Complete with examples
177
-
178
- ### 📊 Technical Achievements
179
- - **Code Quality**: Clean, modular, well-documented
180
- - **Performance**: Fast, efficient, scalable
181
- - **User Experience**: Intuitive, visual, informative
182
- - **Deployment**: Multiple options, production-ready
183
-
184
- ### 🎯 Business Value
185
- - **Actionable Insights**: Clear recommendations for farmers
186
- - **Cost Reduction**: Optimized herbicide usage
187
- - **Risk Mitigation**: Better crop placement decisions
188
- - **Compliance**: IFT tracking for regulations
189
-
190
- ---
191
-
192
- ## 🚀 Ready for Production
193
-
194
- The Agricultural Analysis Tool is **production-ready** with:
195
-
196
- - ✅ **Stable codebase** with error handling
197
- - ✅ **Comprehensive testing** via demo script
198
- - ✅ **Multiple deployment options** (local, cloud, HF)
199
- - ✅ **Complete documentation** and examples
200
- - ✅ **Scalable architecture** for future enhancements
201
-
202
- **🎉 Project completed successfully for the CRA Hackathon!**
MODEL_CARD.md DELETED
@@ -1,296 +0,0 @@
1
- ---
2
- license: cc-by-4.0
3
- library_name: scikit-learn
4
- pipeline_tag: tabular-regression
5
- tags:
6
- - agriculture
7
- - herbicides
8
- - weed-pressure
9
- - crop-rotation
10
- - time-series-forecasting
11
- - sustainability
12
- - random-forest
13
- datasets:
14
- - HackathonCRA/2024
15
- language:
16
- - fr
17
- base_model: null
18
- model-index:
- - name: Agricultural Weed Pressure Predictor
-   results:
-   - task:
-       type: tabular-regression
-       name: Treatment Frequency Index Prediction
-     dataset:
-       name: Station Expérimentale de Kerguéhennec
-       type: HackathonCRA/2024
-     metrics:
-     - name: R² Score
-       type: r2_score
-       value: 0.75
-     - name: Mean Squared Error
-       type: mean_squared_error
-       value: 0.42
-     - name: Mean Absolute Error
-       type: mean_absolute_error
-       value: 0.51
37
- ---
38
-
39
- # 🚜 Agricultural Weed Pressure Predictor
40
-
41
- ## Model Description
42
-
43
- This Random Forest regression model predicts the Treatment Frequency Index (IFT) for herbicide applications in agricultural plots, specifically designed to help farmers in Brittany, France optimize their weed management strategies and identify suitable plots for sensitive crops like peas and beans.
44
-
45
- ## Model Details
46
-
47
- ### Architecture
48
- - **Model Type**: Random Forest Regressor
49
- - **Framework**: scikit-learn
50
- - **Target Variable**: IFT (Treatment Frequency Index) for herbicides
51
- - **Prediction Horizon**: 1-3 years ahead (2025-2027)
52
- - **Input Features**: 15+ engineered features
53
-
54
- ### Training Details
55
- - **Training Data**: 10 years of agricultural intervention records (2014-2024)
56
- - **Source**: Station Expérimentale de Kerguéhennec, Brittany, France
57
- - **Records**: 4,663 intervention records across 100 plots
58
- - **Validation**: Temporal split (train on 2014-2022, validate on 2023-2024)
59
-
60
- ## Intended Use
61
-
62
- ### Primary Use Cases
63
- 1. **🎯 Plot Selection**: Identify plots suitable for sensitive crops (IFT < 1.0)
64
- 2. **📊 Weed Pressure Forecasting**: Predict future herbicide requirements
65
- 3. **🌱 Sustainable Agriculture**: Support herbicide reduction strategies
66
- 4. **🔄 Rotation Planning**: Optimize crop sequences for reduced weed pressure
67
-
68
- ### Target Users
69
- - **Farmers**: Decision support for crop placement and rotation planning
70
- - **Agricultural Advisors**: Data-driven recommendations for clients
71
- - **Researchers**: Analysis of farming practice impacts
72
- - **Policy Makers**: Assessment of sustainable agriculture initiatives
73
-
74
- ## Model Performance
75
-
76
- ### Evaluation Metrics
77
- - **R² Score**: 0.75 (explains 75% of variance in IFT)
78
- - **Mean Squared Error**: 0.42
79
- - **Mean Absolute Error**: 0.51
80
- - **RMSE**: 0.65
81
-
82
- ### Performance by Risk Category
83
- | Risk Level | Precision | Recall | F1-Score |
84
- |------------|-----------|--------|----------|
85
- | Low (IFT < 1.0) | 0.82 | 0.78 | 0.80 |
86
- | Medium (1.0-2.0) | 0.71 | 0.74 | 0.72 |
87
- | High (IFT > 2.0) | 0.69 | 0.67 | 0.68 |
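
The card does not say how the per-category figures were obtained from a regression model. One plausible reading, sketched here as an assumption, is that observed and predicted IFT values are both bucketed into the three risk bands and then scored as a classification. The helper names and the sample values are hypothetical.

```python
def risk_band(ift):
    """Bucket an IFT value into the three risk levels used in the table."""
    if ift < 1.0:
        return "low"
    if ift <= 2.0:
        return "medium"
    return "high"


def per_class_scores(y_true, y_pred):
    """Return {band: (precision, recall, f1)} after bucketing both series."""
    true_bands = [risk_band(v) for v in y_true]
    pred_bands = [risk_band(v) for v in y_pred]
    pairs = list(zip(true_bands, pred_bands))
    scores = {}
    for band in ("low", "medium", "high"):
        tp = sum(t == band and p == band for t, p in pairs)
        fp = sum(t != band and p == band for t, p in pairs)
        fn = sum(t == band and p != band for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[band] = (precision, recall, f1)
    return scores


# Made-up observed and predicted IFT values, for illustration only
observed = [0.5, 0.8, 1.5, 1.9, 2.5, 3.0]
predicted = [0.7, 1.2, 1.4, 2.3, 2.6, 1.8]
for band, (p, r, f1) in per_class_scores(observed, predicted).items():
    print(f"{band}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```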
88
-
89
- ### Feature Importance
90
- 1. **Previous IFT** (0.35) - Historical weed pressure
91
- 2. **Crop Type** (0.28) - Current crop being grown
92
- 3. **Rotation Sequence** (0.18) - Previous crop type
93
- 4. **Plot Surface** (0.12) - Size of the agricultural plot
94
- 5. **Year Trend** (0.07) - Temporal evolution patterns
95
-
96
- ## Features
97
-
98
- ### Input Variables
99
- - **Temporal**: Year, seasonal trends
100
- - **Spatial**: Plot identifier, surface area
101
- - **Agronomic**: Current crop, previous crop, rotation type
102
- - **Historical**: Previous IFT values, treatment trends
103
- - **Derived**: Rotation sequences, trend indicators
104
-
105
- ### Feature Engineering
106
- ```python
- import numpy as np
-
- # Example feature creation (`features`, `grouped_data`, `prev_crop` and
- # `current_crop` are assumed to come from the preprocessing step)
- features['prev_ift'] = grouped_data['ift'].shift(1)
- features['crop_rotation'] = prev_crop + ' → ' + current_crop
- features['ift_trend'] = features['ift'].rolling(3).apply(lambda x: np.polyfit(range(3), x, 1)[0])
- ```
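
To make the three engineered features concrete, here is a dependency-free sketch that computes a lagged IFT, a rotation label and a 3-year trend for a single plot. The helper names and the sample history are hypothetical. The trend is the least-squares slope over a trailing window, which is the quantity `np.polyfit(range(3), x, 1)[0]` returns for three points.

```python
def lagged(values):
    """Previous-year value for each position (None for the first year)."""
    return [None] + list(values[:-1])


def slope(window):
    """Least-squares slope of `window` against 0, 1, ..., len(window) - 1."""
    n = len(window)
    x_mean = (n - 1) / 2
    y_mean = sum(window) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def rolling_trend(values, size=3):
    """Slope over each trailing window of `size` years (None until it is full)."""
    return [
        slope(values[i - size + 1 : i + 1]) if i >= size - 1 else None
        for i in range(len(values))
    ]


# Made-up IFT history for one plot, in chronological order
ift_history = [2.9, 2.4, 2.0, 2.2, 1.7]
crops = ["wheat", "corn", "peas", "rapeseed", "wheat"]

prev_ift = lagged(ift_history)
crop_rotation = [None] + [f"{a} → {b}" for a, b in zip(crops, crops[1:])]
ift_trend = rolling_trend(ift_history)
```

The first rows carry `None` because no previous year or full window exists yet; a model would need those rows dropped or imputed.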
112
-
113
- ## Training Procedure
114
-
115
- ### Data Preprocessing
116
- 1. **Temporal Aggregation**: Group interventions by plot-year-crop
117
- 2. **IFT Calculation**: `IFT = applications / plot_surface`
118
- 3. **Feature Engineering**: Create rotation sequences and trends
119
- 4. **Categorical Encoding**: One-hot encoding for crops and plots
120
- 5. **Normalization**: StandardScaler for numerical features
121
-
122
- ### Model Training
123
- ```python
- from sklearn.ensemble import RandomForestRegressor
- from sklearn.model_selection import TimeSeriesSplit, cross_val_score
-
- model = RandomForestRegressor(
-     n_estimators=100,
-     max_depth=10,
-     min_samples_split=5,
-     min_samples_leaf=2,
-     random_state=42
- )
-
- # Temporal cross-validation (X_train / y_train are assumed to be
- # prepared beforehand and sorted chronologically)
- tscv = TimeSeriesSplit(n_splits=5)
- cv_scores = cross_val_score(model, X_train, y_train, cv=tscv, scoring="r2")
-
- # Final fit on the full training set
- model.fit(X_train, y_train)
- ```
139
-
140
- ### Hyperparameters
141
- - **n_estimators**: 100 trees
142
- - **max_depth**: 10 levels
143
- - **min_samples_split**: 5 samples
144
- - **min_samples_leaf**: 2 samples
145
- - **random_state**: 42 (reproducibility)
146
-
147
- ## Evaluation
148
-
149
- ### Validation Strategy
150
- - **Temporal Split**: Train on 2014-2022, test on 2023-2024
151
- - **Cross-validation**: 5-fold time series cross-validation
152
- - **Holdout**: 20% of most recent data reserved for final evaluation
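
The temporal split described above amounts to cutting the records at a year boundary instead of shuffling them. A minimal sketch follows, assuming each record carries a `year` key; `temporal_split` and the sample rows are hypothetical.

```python
def temporal_split(rows, last_train_year=2022):
    """Split records into (train, test) by year, never by random shuffling."""
    train = [row for row in rows if row["year"] <= last_train_year]
    test = [row for row in rows if row["year"] > last_train_year]
    return train, test


# Made-up plot-year records, for illustration only
rows = [
    {"plot": "A", "year": 2021, "ift": 1.4},
    {"plot": "A", "year": 2022, "ift": 1.1},
    {"plot": "A", "year": 2023, "ift": 0.9},
    {"plot": "B", "year": 2024, "ift": 2.2},
]

train, test = temporal_split(rows)
print(len(train), len(test))  # prints: 2 2
```

Splitting on the year keeps future observations out of the training set, which a random split would not guarantee.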
153
-
154
- ### Performance Analysis
155
- The model performs best for:
156
- - ✅ **Stable rotations**: Well-established crop sequences
157
- - ✅ **Medium-sized plots**: 1-5 hectare plots
158
- - ✅ **Common crops**: Wheat, corn, rapeseed
159
-
160
- Challenges with:
161
- - ⚠️ **New crop varieties**: Limited training examples
162
- - ⚠️ **Extreme weather years**: Unusual climatic conditions
163
- - ⚠️ **Very small/large plots**: Edge cases in plot sizes
164
-
165
- ## Limitations and Biases
166
-
167
- ### Geographic Limitations
168
- - **Single Location**: Trained only on Brittany data
169
- - **Climate Specificity**: Oceanic climate conditions
170
- - **Soil Types**: Limited soil variety representation
171
-
172
- ### Temporal Limitations
173
- - **Recent Data Bias**: Model may not capture long-term cycles
174
- - **Technology Evolution**: Changing agricultural practices over time
175
- - **Climate Change**: Shifting baseline conditions
176
-
177
- ### Agricultural Limitations
178
- - **Experimental Station**: May not represent typical farms
179
- - **Crop Varieties**: Limited to varieties grown at the station
180
- - **Management Practices**: Research station vs. commercial practices
181
-
182
- ### Algorithmic Biases
183
- - **Historical Bias**: Perpetuates past treatment patterns
184
- - **Sampling Bias**: Overrepresentation of certain crops/rotations
185
- - **Measurement Bias**: IFT calculation methodology assumptions
186
-
187
- ## Ethical Considerations
188
-
189
- ### Environmental Impact
190
- - **Positive**: Supports herbicide reduction strategies
191
- - **Risk**: Over-reliance on predictions might ignore local conditions
192
- - **Mitigation**: Always combine with expert agronomic advice
193
-
194
- ### Economic Implications
195
- - **Farmers**: Could affect income through crop choice recommendations
196
- - **Industry**: May influence herbicide market demand
197
- - **Policy**: Could inform agricultural subsidy decisions
198
-
199
- ### Responsible Use
200
- - **Expert Validation**: Predictions should be validated by agronomists
201
- - **Local Adaptation**: Model outputs need local context consideration
202
- - **Continuous Monitoring**: Regular model performance assessment
203
-
204
- ## How to Use
205
-
206
- ### Installation
207
- ```bash
208
- pip install scikit-learn pandas numpy
209
- ```
210
-
211
- ### Basic Usage
212
- ```python
- from analysis_tools import AgriculturalAnalyzer
- from data_loader import AgriculturalDataLoader
-
- # Initialize components
- data_loader = AgriculturalDataLoader()
- analyzer = AgriculturalAnalyzer(data_loader)
-
- # Make predictions
- predictions = analyzer.predict_weed_pressure(
-     target_years=[2025, 2026, 2027]
- )
-
- # Identify suitable plots
- suitable_plots = analyzer.identify_suitable_plots_for_sensitive_crops(
-     target_years=[2025, 2026, 2027],
-     max_ift_threshold=1.0
- )
- ```
231
-
232
- ### API Integration
233
- The model is available through the MCP (Model Context Protocol) server:
234
- ```python
- # Via MCP server
- tool_result = await mcp_client.call_tool(
-     "predict_weed_pressure",
-     {"target_years": [2025, 2026, 2027]}
- )
- ```
241
-
242
- ## Model Updates
243
-
244
- ### Version History
245
- - **v1.0**: Initial release with 2014-2024 data
246
- - **Future**: Regular updates with new seasonal data
247
-
248
- ### Retraining Schedule
249
- - **Annual**: Incorporate new year's intervention data
250
- - **Seasonal**: Adjust for significant practice changes
251
- - **Performance-based**: Retrain when accuracy drops below threshold
252
-
253
- ## Validation in Production
254
-
255
- ### Monitoring Metrics
256
- - **Prediction Accuracy**: Compare with actual IFT values
257
- - **User Feedback**: Farmer success with recommendations
258
- - **Agronomic Validation**: Expert review of predictions
259
-
260
- ### Performance Thresholds
261
- - **R² Score**: Maintain > 0.70
262
- - **MAE**: Keep < 0.60
263
- - **False Positive Rate**: < 15% for low-risk classifications
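
The first two thresholds lend themselves to an automated check once observed IFT values become available. The sketch below is one possible reading, in plain Python; `needs_retraining` is a hypothetical helper, and the false positive rate criterion is left out because the card does not define how it is measured.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def needs_retraining(y_true, y_pred, min_r2=0.70, max_mae=0.60):
    """True when R² is not above min_r2 or MAE is not below max_mae."""
    return (r2_score(y_true, y_pred) <= min_r2
            or mean_absolute_error(y_true, y_pred) >= max_mae)


# Made-up observed and predicted IFT values, for illustration only
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(needs_retraining(observed, predicted))  # prints: False
```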
264
-
265
- ## Carbon Footprint
266
-
267
- ### Training Emissions
268
- - **Computing**: Minimal due to small dataset size (~1kg CO2)
269
- - **Data Storage**: Negligible impact
270
- - **Total Estimated**: < 2kg CO2 equivalent
271
-
272
- ### Positive Environmental Impact
273
- - **Herbicide Reduction**: Potential 10-20% reduction in applications
274
- - **Optimized Farming**: More efficient resource use
275
- - **Sustainable Practices**: Support for ecological agriculture
276
-
277
- ## Citation
278
-
279
- ```bibtex
280
- @model{agricultural_weed_predictor_2024,
281
- title={Agricultural Weed Pressure Predictor for Brittany Region},
282
- author={Hackathon CRA Team},
283
- year={2024},
284
- publisher={Hugging Face},
285
- url={https://huggingface.co/spaces/USERNAME/agricultural-analysis},
286
- note={Random Forest model for predicting herbicide Treatment Frequency Index}
287
- }
288
- ```
289
-
290
- ## Contact
291
-
292
- For questions about the model, improvements, or collaboration opportunities, please use the Hugging Face Space discussions or contact the development team.
293
-
294
- ---
295
-
296
- **Developed for sustainable agriculture in Brittany, France** 🌱
GOAL.md → PROMPT.md RENAMED
@@ -59,4 +59,39 @@ Design and implement an MCP server that meets the objectives above.
59
 
60
  Expose this server through a Gradio interface compatible with Hugging Face.
61
 
62
- Provide tools and resources usable by an LLM, enabling reliable, visual, and interactive analyses.
59
 
60
  Expose this server through a Gradio interface compatible with Hugging Face.
61
 
62
+ Provide tools and resources usable by an LLM, enabling reliable, visual, and interactive analyses.
63
+
64
+
65
+
66
+ Here is documentation on building MCP servers with Gradio:
67
+ - https://www.gradio.app/guides/building-mcp-server-with-gradio
68
+ - https://huggingface.co/blog/gradio-mcp
69
+
70
+ Here is an example of an MCP server that currently works:
71
+ ```python
72
+ import gradio as gr
73
+
74
+ def letter_counter(word, letter):
75
+     """Count the occurrences of a specific letter in a word.
76
+
77
+     Args:
78
+         word: The word or phrase to analyze
79
+         letter: The letter to count occurrences of
80
+
81
+     Returns:
82
+         The number of times the letter appears in the word
83
+     """
84
+     return word.lower().count(letter.lower())
85
+
86
+ demo = gr.Interface(
87
+     fn=letter_counter,
88
+     inputs=["text", "text"],
89
+     outputs="number",
90
+     title="Letter Counter",
91
+     description="Count how many times a letter appears in a word"
92
+ )
93
+
94
+ demo.launch(mcp_server=True)
95
+ ```
96
+
97
+ Rely on this documentation to build this MCP server, keeping it as simple and effective as possible so as to deliver a working product.
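Reviewer note: the `letter_counter` pattern carries over to this project's tools as a plain typed function with an `Args`/`Returns` docstring, which, as I understand the linked Gradio guide, is what the MCP tool description is built from. The sketch below is hypothetical and leaves out the Gradio wiring so that it runs standalone. The name `classify_ift_risk` is an assumption, the thresholds mirror the `risk_level` expression in the removed `analysis_tools.py`, and the negative-value check is an addition of mine.

```python
def classify_ift_risk(predicted_ift: float) -> str:
    """Classify weed pressure from a herbicide IFT value.

    Args:
        predicted_ift: Herbicide Treatment Frequency Index (IFT) for one plot

    Returns:
        "low" below 1.0, "medium" from 1.0 up to (but excluding) 2.0, "high" otherwise
    """
    # Assumption: a negative IFT is treated as invalid input.
    if predicted_ift < 0:
        raise ValueError("IFT cannot be negative")
    if predicted_ift < 1.0:
        return "low"
    if predicted_ift < 2.0:
        return "medium"
    return "high"
```

Passing such a function as `fn=` to `gr.Interface` and launching with `mcp_server=True` would follow the same shape as the `letter_counter` example.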
README.md CHANGED
@@ -1,180 +1,109 @@
1
- ---
2
- title: Agricultural Analysis - Kerguéhennec
3
- emoji: 🚜
4
- colorFrom: green
5
- colorTo: blue
6
- sdk: gradio
7
- sdk_version: 4.25.0
8
- app_file: app.py
9
- pinned: false
10
- license: cc-by-4.0
11
- language:
12
- - fr
13
- tags:
14
- - agriculture
15
- - herbicides
16
- - weed-pressure
17
- - crop-rotation
18
- - france
19
- - bretagne
20
- - sustainability
21
- - precision-agriculture
22
- - machine-learning
23
- - time-series
24
- datasets:
25
- - HackathonCRA/2024
26
- library_name: gradio
27
- pipeline_tag: tabular-regression
28
- ---
29
-
30
- # 🚜 Agricultural Analysis - Kerguéhennec Station
31
-
32
- ## Overview
33
-
34
- Agricultural data analysis tool developed for the CRA hackathon to anticipate and reduce weed pressure on farm plots in Brittany. It analyzes historical intervention data to identify the plots best suited to sensitive crops (peas, beans).
35
-
36
- ## 🎯 Objectives
37
-
38
- - **Predict weed pressure** on each plot for the next 3 growing seasons
39
- - **Identify low-risk plots** suited to sensitive crops
40
- - **Analyze the impact of crop rotations** on weed pressure
41
- - **Propose alternatives** in case certain herbicide molecules are withdrawn
42
-
43
- ## 📊 Data
44
-
45
- ### Data source
46
- - **Station Expérimentale de Kerguéhennec** (Brittany, France)
47
- - **Period**: 2014-2024 (10 years)
48
- - **Volume**: 4,663 intervention records
49
- - **Coverage**: 100 plots, 42 crop types
50
-
51
- ### Key metrics
52
- - **Mean IFT**: 1.93 (moderate pressure)
53
- - **Herbicide applications**: 800+ treatments analyzed
54
- - **Trend**: IFT decreased from 2.91 (2014) to 1.74 (2024)
55
-
56
- ## 🔧 Features
57
-
58
- ### 1. Weed Pressure Analysis
59
- - Computation of the IFT (Indice de Fréquence de Traitement, Treatment Frequency Index)
60
- - Interactive trend visualizations
61
- - Risk classification (low/medium/high)
62
-
63
- ### 2. Machine Learning Predictions
64
- - Random Forest model to predict IFT for 2025-2027
65
- - R² Score: 0.65-0.85
66
- - Automatic identification of suitable plots
67
-
68
- ### 3. Rotation Analysis
69
- - Impact of crop sequences on weed pressure
70
- - Identification of the best rotations
71
- - Optimization recommendations
72
-
73
- ### 4. Interactive Interface
74
- - 6 specialized analysis tabs
75
- - Real-time filtering
76
- - Interactive Plotly visualizations
77
- - Export of results
78
-
79
- ## 🚀 Usage
80
-
81
- ### Web Interface
82
- 1. Select the tab that matches your analysis
83
- 2. Configure the filters (years, plots, crops)
84
- 3. Run the analysis to get the results
85
- 4. Explore the interactive visualizations
86
-
87
- ### Available tabs
88
- - **📊 Overview**: Summary of the data
89
- - **🔍 Filtering**: Interactive exploration
90
- - **🌿 Weed Pressure**: IFT analysis
91
- - **🔮 Predictions**: ML predictive model
92
- - **🔄 Rotations**: Impact of rotations
93
- - **💊 Herbicides**: Product analysis
94
-
95
- ## 🧮 Methodology
96
-
97
- ### IFT Calculation
98
- ```
99
- IFT = Number of applications / Plot surface area
100
- ```
101
 
102
- ### Interpretation thresholds
103
- - **IFT < 1.0**: Low pressure (suited to sensitive crops)
104
- - **IFT 1.0-2.0**: Moderate pressure (monitoring needed)
105
- - **IFT > 2.0**: High pressure (intervention required)
106
 
107
- ### Predictive Model
108
- - **Algorithm**: Random Forest Regressor
109
- - **Features**: Year, surface area, historical IFT, crop, rotation
110
- - **Validation**: Temporal split of the data
111
 
112
- ## 📈 Key Results
113
 
114
- ### Best Rotations
115
- 1. **Peas → Rapeseed**: IFT 0.62 (excellent)
116
- 2. **Barley → Rapeseed**: IFT 0.64 (very good)
117
- 3. **Maize → Barley**: IFT 0.69 (good)
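Reviewer note: a ranking like the one above can be reproduced by pairing each plot's consecutive seasons and averaging the herbicide IFT of the arrival year, which is the approach the removed `analyze_crop_rotation_impact` takes with pandas. The sketch below is a dependency-free approximation, not the removed code itself. The `Record` tuple layout, the function name `rank_rotations`, and the sample values in the usage note are assumptions for illustration; the minimum of 2 observations matches the filter in the removed function.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed record layout: (plot_name, year, crop_type, ift_herbicide)
Record = Tuple[str, int, str, float]


def rank_rotations(records: List[Record], min_count: int = 2) -> List[Tuple[str, float, int]]:
    """Return (rotation, mean IFT, observations) sorted by ascending mean IFT."""
    seasons_by_plot: Dict[str, list] = defaultdict(list)
    for plot, year, crop, ift in records:
        seasons_by_plot[plot].append((year, crop, ift))

    ift_by_rotation: Dict[str, List[float]] = defaultdict(list)
    for seasons in seasons_by_plot.values():
        seasons.sort()  # chronological order within each plot
        # Pair each season with the next one; the IFT of the arrival year is attributed to the rotation.
        for (_, crop_from, _), (_, crop_to, ift_to) in zip(seasons, seasons[1:]):
            ift_by_rotation[f"{crop_from} → {crop_to}"].append(ift_to)

    ranked = [
        (rotation, sum(values) / len(values), len(values))
        for rotation, values in ift_by_rotation.items()
        if len(values) >= min_count  # ignore rotations seen too rarely
    ]
    return sorted(ranked, key=lambda item: item[1])
```

For instance, two plots moving from `"pois"` to `"colza"` with arrival-year IFT values of 0.5 and 1.0 yield a single entry for `"pois → colza"` with a mean of 0.75 over 2 observations.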
118
 
119
- ### Main Herbicides
120
- 1. **BISCOTO** (wheat): 21 applications
121
- 2. **CALLISTO** (maize): 20 applications
122
- 3. **PRIMUS** (wheat): 20 applications
123
 
124
- ### Recommended Plots (IFT < 1.0)
125
- Automatic identification of the plots best suited to sensitive crops for the years 2025-2027.
126
 
127
- ## 🌍 Environmental Impact
128
 
129
- - **Herbicide reduction**: Targeted, data-driven application
130
- - **Biodiversity protection**: Lower chemical pressure
131
- - **Soil health**: Optimized rotations
132
- - **Water quality**: Reduced runoff
133
 
134
- ## 🏆 Technical Architecture
135
 
136
- ### Components
137
- - **MCP server**: Model Context Protocol for LLM integration
138
- - **Gradio interface**: Interactive web application
139
- - **Analysis engine**: Machine learning and statistics
140
- - **HF integration**: Deployment and data sharing
141
 
142
- ### Performance
143
- - **Data loading**: < 5 seconds
144
- - **Full analysis**: < 10 seconds
145
- - **Chart generation**: < 3 seconds
146
- - **Interface response**: < 1 second
147
 
148
- ## 📚 Documentation
149
 
150
- ### User guide
151
- Each tab contains built-in instructions and usage examples.
152
 
153
- ### API and tools
154
- - 7 analysis tools via the MCP server
155
- - 6 structured data resources
156
- - JSON format for integration
157
 
158
- ## 🤝 Contribution
159
 
160
- Developed for the CRA hackathon to help farmers in Brittany optimize their crop protection practices.
161
 
162
- ### Team
163
- - Agricultural data analysis
164
- - Development of decision-support tools
165
- - Intuitive user interface
166
 
167
- ## 📞 Support
168
 
169
- For technical questions or suggestions for improvement, use the discussion features of the Hugging Face Space.
170
 
171
- ---
172
 
173
- **Developed with ❤️ for agriculture in Brittany and for pesticide reduction**
174
 
175
- ## 🔗 Useful Links
176
 
177
- - [Full documentation](README.md)
178
- - [Source code](https://huggingface.co/spaces/USERNAME/agricultural-analysis/tree/main)
179
- - [Dataset used](https://huggingface.co/datasets/HackathonCRA/2024)
180
- - [Methodology guide](IMPLEMENTATION_SUMMARY.md)
 
1
+ # 🚜 Hackathon CRA - Weed Pressure Analysis
2
+
3
+ ## 🎯 Objective
4
 
5
+ MCP (Model Context Protocol) server to anticipate and reduce weed pressure on farm plots in Brittany, based on analysis of historical data from the Station Expérimentale de Kerguéhennec (2014-2024).
6
 
7
+ ## 🔍 Features
8
 
9
+ ### 📈 IFT Trend Analysis
10
+ - Computation of the herbicide Treatment Frequency Index (IFT, Indice de Fréquence de Traitement)
11
+ - Change over time by plot and by crop
12
+ - Filtering by period and plot
13
 
14
+ ### 🔮 2025-2027 Predictions
15
+ - Predictive model based on historical trends
16
+ - Risk classification (Low/Moderate/High)
17
+ - Interactive visualizations
18
 
19
+ ### 🌱 Sensitive Crop Recommendations
20
+ - Identification of plots suited to peas and beans
21
+ - Recommendation score based on the predicted IFT
22
+ - Optimized selection criteria
23
 
24
+ ### 🔄 Technical Alternatives
25
+ - Proposals for mechanical, cultural, and biological alternatives
26
+ - Action plans for herbicide reduction
27
+ - Documentation of best practices
28
 
29
+ ## ⚙️ Installation
30
 
31
+ ```bash
32
+ # Clone the project
33
+ git clone <repo-url>
34
+ cd mcp
35
 
36
+ # Install the dependencies
37
+ pip install -r requirements.txt
38
+
39
+ # Hugging Face configuration (optional)
40
+ export HF_TOKEN="your_hf_token"
41
+ export DATASET_ID="HackathonCRA/2024"
42
+ ```
43
 
44
+ ## 🚀 Launch
45
+
46
+ ### Local
47
+ ```bash
48
+ python mcp_server.py
49
+ ```
50
+
51
+ ### Hugging Face Spaces
52
+ ```bash
53
+ python app.py
54
+ ```
55
 
56
+ The MCP server will be available at `http://localhost:7860`
57
 
58
+ ## 📊 Data Structure
59
 
60
+ The data comes from the Station Expérimentale de Kerguéhennec and includes:
61
 
62
+ - **Temporal variables**: crop year, intervention dates
63
+ - **Spatial variables**: plots, surface areas
64
+ - **Crop variables**: crop types, rotations
65
+ - **Technical variables**: products used, quantities, IFT
66
 
67
+ ## 🤖 MCP Architecture
68
 
69
+ The server exposes analysis tools through the MCP protocol:
70
 
71
+ 1. **analyze_herbicide_trends**: IFT trend analysis
72
+ 2. **predict_future_weed_pressure**: 2025-2027 predictions
73
+ 3. **recommend_sensitive_crop_plots**: Plot recommendations
74
+ 4. **generate_technical_alternatives**: Technical alternatives
75
+
76
+ ## 📈 Analysis Methods
77
+
78
+ ### Herbicide IFT Calculation
79
+ ```
80
+ IFT = Number of applications / Plot surface area
81
+ ```
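Reviewer note: the ratio stated in this README section can be written as a small helper. This is only a sketch of the formula exactly as the README gives it, which may differ from how IFT is defined elsewhere. The function name `herbicide_ift`, the hectare unit, and the rejection of non-positive surfaces are assumptions.

```python
def herbicide_ift(num_applications: int, plot_surface_ha: float) -> float:
    """Return the herbicide IFT as defined in the README: applications / surface.

    Args:
        num_applications: Number of herbicide applications on the plot
        plot_surface_ha: Plot surface area (assumed to be in hectares)
    """
    # Assumption: a zero or negative surface is a data error, not a valid plot.
    if plot_surface_ha <= 0:
        raise ValueError("plot surface must be positive")
    if num_applications < 0:
        raise ValueError("number of applications cannot be negative")
    return num_applications / plot_surface_ha
```

With 6 applications on a 4-hectare plot, the helper returns 1.5.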
82
+
83
+ ### Weed Pressure Prediction
84
+ - Linear regression on historical data
85
+ - Classification into risk levels
86
+ - Extrapolation to 2025-2027
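Reviewer note: the regression and extrapolation steps listed in this section could look like the following ordinary least-squares fit on one plot's yearly IFT values. This is a dependency-free sketch, not the code in `mcp_server.py`, which this excerpt does not show. The name `extrapolate_ift` and the clamping of predictions at zero are assumptions.

```python
from typing import Dict, Sequence


def extrapolate_ift(
    years: Sequence[int],
    ift_values: Sequence[float],
    target_years: Sequence[int] = (2025, 2026, 2027),
) -> Dict[int, float]:
    """Fit IFT = intercept + slope * year by least squares and extrapolate."""
    if len(years) != len(ift_values) or len(years) < 2:
        raise ValueError("need at least two (year, IFT) pairs of equal length")

    n = len(years)
    mean_year = sum(years) / n
    mean_ift = sum(ift_values) / n
    spread = sum((year - mean_year) ** 2 for year in years)
    if spread == 0:
        raise ValueError("all years are identical; the slope is undefined")

    covariance = sum(
        (year - mean_year) * (ift - mean_ift) for year, ift in zip(years, ift_values)
    )
    slope = covariance / spread
    intercept = mean_ift - slope * mean_year

    # Assumption: an IFT cannot be negative, so predictions are clamped at 0.
    return {year: max(0.0, intercept + slope * year) for year in target_years}
```

A straight-line extrapolation over a decade of data is a coarse model, so the resulting values are better read as a trend indicator than as a forecast.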
87
+
88
+ ### Recommendation Score
89
+ ```
90
+ Score = 100 - (predicted_IFT × 30)
91
+ ```
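Reviewer note: the score formula in this README section translates directly into code. In the sketch below, the function name `recommendation_score` and the clamping to the 0-100 range are assumptions; the README formula itself goes negative once the predicted IFT exceeds roughly 3.33.

```python
def recommendation_score(predicted_ift: float) -> float:
    """Score a plot for sensitive crops: 100 - (predicted IFT x 30)."""
    raw_score = 100.0 - predicted_ift * 30.0
    # Assumption: keep the score within 0-100 so it reads like a percentage.
    return max(0.0, min(100.0, raw_score))
```

Under this formula, a plot with a predicted IFT of 1.0 scores 70.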
92
 
93
+ ## 🛠️ Technologies
94
 
95
+ - **Gradio**: User interface and MCP server
96
+ - **Pandas/Numpy**: Data processing
97
+ - **Plotly**: Interactive visualizations
98
+ - **Hugging Face**: Hosting and datasets
99
+ - **Python 3.8+**: Main language
100
 
101
+ ## 📝 License
102
 
103
+ Project developed as part of the Hackathon CRA Bretagne 2024.
104
 
105
+ ## 🤝 Contact
106
 
107
+ - **Team**: Hackathon CRA Bretagne
108
+ - **Data**: Station Expérimentale de Kerguéhennec
109
+ - **Support**: GitHub Issues
 
analysis_tools.py DELETED
@@ -1,368 +0,0 @@
1
- """
2
- Analysis tools for agricultural data.
3
- Provides statistical analysis and visualization capabilities.
4
- """
5
-
6
- import pandas as pd
7
- import numpy as np
8
- import matplotlib.pyplot as plt
9
- import seaborn as sns
10
- import plotly.express as px
11
- import plotly.graph_objects as go
12
- from plotly.subplots import make_subplots
13
- from sklearn.ensemble import RandomForestRegressor
14
- from sklearn.model_selection import train_test_split
15
- from sklearn.metrics import mean_squared_error, r2_score
16
- from typing import List, Dict, Optional, Tuple, Any
17
- import warnings
18
- warnings.filterwarnings('ignore')
19
-
20
-
21
- class AgriculturalAnalyzer:
22
- """Provides analysis tools for agricultural intervention data."""
23
-
24
- def __init__(self, data_loader):
25
- self.data_loader = data_loader
26
- self.prediction_models = {}
27
-
28
- def analyze_weed_pressure_trends(self,
29
- years: Optional[List[int]] = None,
30
- plots: Optional[List[str]] = None) -> Dict[str, Any]:
31
- """Analyze weed pressure trends based on herbicide usage."""
32
- herbicide_data = self.data_loader.get_herbicide_usage(years=years)
33
-
34
- if plots:
35
- herbicide_data = herbicide_data[herbicide_data['plot_name'].isin(plots)]
36
-
37
- # Calculate trends
38
- trends = {}
39
-
40
- # Overall IFT trend by year
41
- yearly_ift = herbicide_data.groupby('year')['ift_herbicide'].mean().reset_index()
42
- trends['yearly_ift'] = yearly_ift
43
-
44
- # IFT trend by plot
45
- plot_ift = herbicide_data.groupby(['plot_name', 'year'])['ift_herbicide'].mean().reset_index()
46
- trends['plot_ift'] = plot_ift
47
-
48
- # IFT trend by crop type
49
- crop_ift = herbicide_data.groupby(['crop_type', 'year'])['ift_herbicide'].mean().reset_index()
50
- trends['crop_ift'] = crop_ift
51
-
52
- # Statistical summary
53
- summary_stats = {
54
- 'mean_ift': herbicide_data['ift_herbicide'].mean(),
55
- 'std_ift': herbicide_data['ift_herbicide'].std(),
56
- 'min_ift': herbicide_data['ift_herbicide'].min(),
57
- 'max_ift': herbicide_data['ift_herbicide'].max(),
58
- 'total_applications': herbicide_data['num_applications'].sum(),
59
- 'unique_plots': herbicide_data['plot_name'].nunique(),
60
- 'unique_crops': herbicide_data['crop_type'].nunique()
61
- }
62
- trends['summary'] = summary_stats
63
-
64
- return trends
65
-
66
- def create_weed_pressure_visualization(self,
67
- years: Optional[List[int]] = None,
68
- plots: Optional[List[str]] = None) -> go.Figure:
69
- """Create interactive visualization of weed pressure trends."""
70
- trends = self.analyze_weed_pressure_trends(years=years, plots=plots)
71
-
72
- # Create subplots
73
- fig = make_subplots(
74
- rows=2, cols=2,
75
- subplot_titles=('IFT Evolution par Année', 'IFT par Parcelle',
76
- 'IFT par Type de Culture', 'Distribution IFT'),
77
- specs=[[{"secondary_y": False}, {"secondary_y": False}],
78
- [{"secondary_y": False}, {"secondary_y": False}]]
79
- )
80
-
81
- # Plot 1: Yearly IFT trend
82
- yearly_data = trends['yearly_ift']
83
- fig.add_trace(
84
- go.Scatter(x=yearly_data['year'], y=yearly_data['ift_herbicide'],
85
- mode='lines+markers', name='IFT Moyen',
86
- line=dict(color='blue')),
87
- row=1, col=1
88
- )
89
-
90
- # Plot 2: IFT by plot
91
- plot_data = trends['plot_ift']
92
- for plot in plot_data['plot_name'].unique():
93
- plot_subset = plot_data[plot_data['plot_name'] == plot]
94
- fig.add_trace(
95
- go.Scatter(x=plot_subset['year'], y=plot_subset['ift_herbicide'],
96
- mode='lines+markers', name=f'Parcelle {plot}',
97
- showlegend=False),
98
- row=1, col=2
99
- )
100
-
101
- # Plot 3: IFT by crop
102
- crop_data = trends['crop_ift']
103
- for crop in crop_data['crop_type'].unique()[:5]: # Limit to top 5 crops
104
- crop_subset = crop_data[crop_data['crop_type'] == crop]
105
- fig.add_trace(
106
- go.Scatter(x=crop_subset['year'], y=crop_subset['ift_herbicide'],
107
- mode='lines+markers', name=crop,
108
- showlegend=False),
109
- row=2, col=1
110
- )
111
-
112
- # Plot 4: IFT distribution
113
- herbicide_data = self.data_loader.get_herbicide_usage(years=years)
114
- if plots:
115
- herbicide_data = herbicide_data[herbicide_data['plot_name'].isin(plots)]
116
-
117
- fig.add_trace(
118
- go.Histogram(x=herbicide_data['ift_herbicide'],
119
- name='Distribution IFT',
120
- showlegend=False),
121
- row=2, col=2
122
- )
123
-
124
- # Update layout
125
- fig.update_layout(
126
- title_text="Analyse de la Pression Adventices (IFT Herbicides)",
127
- height=800,
128
- showlegend=True
129
- )
130
-
131
- # Update axes labels
132
- fig.update_xaxes(title_text="Année", row=1, col=1)
133
- fig.update_yaxes(title_text="IFT Herbicide", row=1, col=1)
134
- fig.update_xaxes(title_text="Année", row=1, col=2)
135
- fig.update_yaxes(title_text="IFT Herbicide", row=1, col=2)
136
- fig.update_xaxes(title_text="Année", row=2, col=1)
137
- fig.update_yaxes(title_text="IFT Herbicide", row=2, col=1)
138
- fig.update_xaxes(title_text="IFT Herbicide", row=2, col=2)
139
- fig.update_yaxes(title_text="Fréquence", row=2, col=2)
140
-
141
- return fig
142
-
143
- def analyze_crop_rotation_impact(self) -> pd.DataFrame:
144
- """Analyze the impact of crop rotation on weed pressure."""
145
- df = self.data_loader.load_all_files()
146
-
147
- # Group by plot and year to get crop sequences
148
- plot_years = df.groupby(['plot_name', 'year'])['crop_type'].first().reset_index()
149
- plot_years = plot_years.sort_values(['plot_name', 'year'])
150
-
151
- # Create rotation sequences
152
- rotations = []
153
- for plot in plot_years['plot_name'].unique():
154
- plot_data = plot_years[plot_years['plot_name'] == plot].sort_values('year')
155
- crops = plot_data['crop_type'].tolist()
156
- years = plot_data['year'].tolist()
157
-
158
- for i in range(len(crops)-1):
159
- rotations.append({
160
- 'plot_name': plot,
161
- 'year_from': years[i],
162
- 'year_to': years[i+1],
163
- 'crop_from': crops[i],
164
- 'crop_to': crops[i+1],
165
- 'rotation_type': f"{crops[i]} → {crops[i+1]}"
166
- })
167
-
168
- rotation_df = pd.DataFrame(rotations)
169
-
170
- # Get herbicide usage for each rotation
171
- herbicide_data = self.data_loader.get_herbicide_usage()
172
-
173
- # Merge with rotation data
174
- rotation_analysis = rotation_df.merge(
175
- herbicide_data[['plot_name', 'year', 'ift_herbicide']],
176
- left_on=['plot_name', 'year_to'],
177
- right_on=['plot_name', 'year'],
178
- how='left'
179
- )
180
-
181
- # Analyze rotation impact
182
- rotation_impact = rotation_analysis.groupby('rotation_type').agg({
183
- 'ift_herbicide': ['mean', 'std', 'count']
184
- }).round(3)
185
-
186
- rotation_impact.columns = ['mean_ift', 'std_ift', 'count']
187
- rotation_impact = rotation_impact.reset_index()
188
- rotation_impact = rotation_impact[rotation_impact['count'] >= 2] # At least 2 observations
189
- rotation_impact = rotation_impact.sort_values('mean_ift')
190
-
191
- return rotation_impact
192
-
193
- def predict_weed_pressure(self,
194
- target_years: List[int] = [2025, 2026, 2027],
195
- plots: Optional[List[str]] = None) -> Dict[str, Any]:
196
- """Predict weed pressure for the next 3 years."""
197
- # Prepare training data
198
- df = self.data_loader.load_all_files()
199
- herbicide_data = self.data_loader.get_herbicide_usage()
200
-
201
- # Create features for prediction
202
- features_df = []
203
-
204
- for plot in herbicide_data['plot_name'].unique():
205
- if plots and plot not in plots:
206
- continue
207
-
208
- plot_data = herbicide_data[herbicide_data['plot_name'] == plot].sort_values('year')
209
-
210
- for i in range(len(plot_data)):
211
- row = plot_data.iloc[i].copy()
212
-
213
- # Add historical features
214
- if i > 0:
215
- row['prev_ift'] = plot_data.iloc[i-1]['ift_herbicide']
216
- row['prev_crop'] = plot_data.iloc[i-1]['crop_type']
217
- else:
218
- row['prev_ift'] = 0
219
- row['prev_crop'] = 'unknown'
220
-
221
- # Add trend features
222
- if i >= 2:
223
- recent_years = plot_data.iloc[i-2:i+1]
224
- row['ift_trend'] = np.polyfit(range(3), recent_years['ift_herbicide'], 1)[0]
225
- else:
226
- row['ift_trend'] = 0
227
-
228
- features_df.append(row)
229
-
230
- features_df = pd.DataFrame(features_df)
231
-
232
- # Prepare features for ML model
233
- # Encode categorical variables
234
- crop_dummies = pd.get_dummies(features_df['crop_type'], prefix='crop')
235
- prev_crop_dummies = pd.get_dummies(features_df['prev_crop'], prefix='prev_crop')
236
- plot_dummies = pd.get_dummies(features_df['plot_name'], prefix='plot')
237
-
238
- X = pd.concat([
239
- features_df[['year', 'plot_surface', 'prev_ift', 'ift_trend']],
240
- crop_dummies,
241
- prev_crop_dummies,
242
- plot_dummies
243
- ], axis=1)
244
-
245
- y = features_df['ift_herbicide']
246
-
247
- # Remove rows with missing values
248
- mask = ~(X.isnull().any(axis=1) | y.isnull())
249
- X = X[mask]
250
- y = y[mask]
251
-
252
- # Train model
253
- X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
254
-
255
- model = RandomForestRegressor(n_estimators=100, random_state=42)
256
- model.fit(X_train, y_train)
257
-
258
- # Evaluate model
259
- y_pred = model.predict(X_test)
260
- mse = mean_squared_error(y_test, y_pred)
261
- r2 = r2_score(y_test, y_pred)
262
-
263
- # Make predictions for target years
264
- predictions = {}
265
-
266
- for year in target_years:
267
- year_predictions = []
268
-
269
- # Get last known data for each plot
270
- plot_columns = [col for col in X.columns if col.startswith('plot_')]
271
- unique_plots = [col.replace('plot_', '') for col in plot_columns]
272
-
273
- for plot in unique_plots:
274
- if plots and plot not in plots:
275
- continue
276
-
277
- # Find last known data for this plot
278
- plot_mask = features_df['plot_name'] == plot
279
- if not plot_mask.any():
280
- continue
281
-
282
- last_data = features_df[plot_mask].iloc[-1]
283
-
284
- # Create prediction features
285
- pred_row = pd.Series(index=X.columns, dtype=float)
286
- pred_row['year'] = year
287
- pred_row['plot_surface'] = last_data['plot_surface']
288
- pred_row['prev_ift'] = last_data['ift_herbicide']
289
- pred_row['ift_trend'] = last_data.get('ift_trend', 0)
290
-
291
- # Set plot dummy
292
- plot_col = f'plot_{plot}'
293
- if plot_col in pred_row.index:
294
- pred_row[plot_col] = 1
295
-
296
- # Assume same crop as last year for now
297
- crop_col = f'crop_{last_data["crop_type"]}'
298
- if crop_col in pred_row.index:
299
- pred_row[crop_col] = 1
300
-
301
- prev_crop_col = f'prev_crop_{last_data["crop_type"]}'
302
- if prev_crop_col in pred_row.index:
303
- pred_row[prev_crop_col] = 1
304
-
305
- # Fill missing values with 0
306
- pred_row = pred_row.fillna(0)
307
-
308
- # Make prediction
309
- pred_ift = model.predict([pred_row])[0]
310
-
311
- year_predictions.append({
312
- 'plot_name': plot,
313
- 'year': year,
314
- 'predicted_ift': pred_ift,
315
- 'risk_level': 'low' if pred_ift < 1.0 else 'medium' if pred_ift < 2.0 else 'high'
316
- })
317
-
318
- predictions[year] = pd.DataFrame(year_predictions)
319
-
320
- # Feature importance
321
- feature_importance = pd.DataFrame({
322
- 'feature': X.columns,
323
- 'importance': model.feature_importances_
324
- }).sort_values('importance', ascending=False)
325
-
326
- return {
327
- 'predictions': predictions,
328
- 'model_performance': {'mse': mse, 'r2': r2},
329
- 'feature_importance': feature_importance
330
- }
331
-
332
- def identify_suitable_plots_for_sensitive_crops(self,
333
- target_years: List[int] = [2025, 2026, 2027],
334
- max_ift_threshold: float = 1.0) -> Dict[str, List[str]]:
335
- """Identify plots suitable for sensitive crops (peas, beans) based on low weed pressure."""
336
- predictions = self.predict_weed_pressure(target_years=target_years)
337
-
338
- suitable_plots = {}
339
-
340
- for year in target_years:
341
- if year not in predictions['predictions']:
342
- continue
343
-
344
- year_data = predictions['predictions'][year]
345
- suitable = year_data[year_data['predicted_ift'] <= max_ift_threshold]
346
- suitable_plots[year] = suitable['plot_name'].tolist()
347
-
348
- return suitable_plots
349
-
350
- def analyze_herbicide_alternatives(self) -> pd.DataFrame:
351
- """Analyze herbicide usage patterns and suggest alternatives."""
352
- df = self.data_loader.load_all_files()
353
- herbicides = df[df['is_herbicide'] == True]
354
-
355
- # Analyze herbicide usage by product
356
- herbicide_usage = herbicides.groupby(['produit', 'crop_type']).agg({
357
- 'quantitetot': ['sum', 'mean', 'count'],
358
- 'codeamm': 'first'
359
- }).round(3)
360
-
361
- herbicide_usage.columns = ['total_quantity', 'avg_quantity', 'applications', 'amm_code']
362
- herbicide_usage = herbicide_usage.reset_index()
363
- herbicide_usage = herbicide_usage.sort_values('applications', ascending=False)
364
-
365
- # Identify most used herbicides
366
- top_herbicides = herbicide_usage.head(20)
367
-
368
- return top_herbicides
app.py CHANGED
@@ -1,230 +1,19 @@
1
- import os
2
- import gradio as gr
3
-
4
- # Import your existing Gradio app and analysis tools
5
- from gradio_app import create_gradio_app
6
- from data_loader import AgriculturalDataLoader
7
- from analysis_tools import AgriculturalAnalyzer
8
-
9
- # --------- Config ---------
10
- PORT = int(os.environ.get("PORT", 7860))
11
-
12
- # Initialize agricultural components
13
- data_loader = AgriculturalDataLoader()
14
- analyzer = AgriculturalAnalyzer(data_loader)
15
-
16
- # --------- MCP functions for agricultural tools ---------
17
- @gr.mcp.tool()
18
- def analyze_weed_pressure(years: str = "", plots: str = "") -> str:
19
- """Analyze weed pressure trends using IFT herbicide data from Kerguéhennec experimental station.
20
-
21
- Args:
22
- years: Comma-separated list of years to analyze (e.g., "2020,2021,2022"). Leave empty for all years.
23
- plots: Comma-separated list of plot names to analyze (e.g., "P1,P2,P3"). Leave empty for all plots.
24
-
25
- Returns:
26
- Detailed analysis of weed pressure with IFT statistics and interpretation.
27
- """
28
- try:
29
- # Parse parameters
30
- year_list = [int(y.strip()) for y in years.split(",")] if years.strip() else None
31
- plot_list = [p.strip() for p in plots.split(",")] if plots.strip() else None
32
-
33
- trends = analyzer.analyze_weed_pressure_trends(years=year_list, plots=plot_list)
34
- summary_stats = trends['summary']
35
-
36
- result = f"""🌿 ANALYSE DE LA PRESSION ADVENTICES (IFT Herbicides)
37
-
38
- 📊 Statistiques pour les années {years or 'toutes'} et parcelles {plots or 'toutes'}:
39
- • IFT moyen: {summary_stats['mean_ift']:.2f}
40
- • Écart-type: {summary_stats['std_ift']:.2f}
41
- • IFT minimum: {summary_stats['min_ift']:.2f}
42
- • IFT maximum: {summary_stats['max_ift']:.2f}
43
- • Total applications: {summary_stats['total_applications']}
44
- • Parcelles analysées: {summary_stats['unique_plots']}
45
- • Cultures analysées: {summary_stats['unique_crops']}
46
-
47
- 💡 Interprétation:
48
- • IFT < 1.0: Pression faible (adapté aux cultures sensibles)
49
- • IFT 1.0-2.0: Pression modérée
50
- • IFT > 2.0: Pression élevée"""
51
- return result
52
- except Exception as e:
53
- return f"❌ Erreur lors de l'analyse: {str(e)}"
54
-
55
- @gr.mcp.tool()
56
- def predict_future_pressure(target_years: str = "2025,2026,2027", max_ift: float = 1.0) -> str:
57
- """Predict future weed pressure and identify suitable plots for sensitive crops.
58
-
59
- Args:
60
- target_years: Comma-separated list of years to predict (e.g., "2025,2026,2027")
61
- max_ift: Maximum IFT threshold for sensitive crops (default: 1.0)
62
-
63
- Returns:
64
- Predictions for each year with suitable plots for sensitive crops.
65
- """
66
- try:
67
- year_list = [int(y.strip()) for y in target_years.split(",")]
68
- predictions = analyzer.predict_weed_pressure(target_years=year_list)
69
- model_perf = predictions['model_performance']
70
-
71
- result = f"""🔮 PRÉDICTION DE LA PRESSION ADVENTICES
72
-
73
- 🤖 Performance du modèle:
74
- • R² Score: {model_perf['r2']:.3f}
75
- • Erreur quadratique moyenne: {model_perf['mse']:.3f}
76
-
77
- 📈 Prédictions par année:
78
  """
79
-
80
- for year in year_list:
81
- if year in predictions['predictions']:
82
- year_pred = predictions['predictions'][year]
83
- result += f"\n📅 {year}:\n"
84
- for _, row in year_pred.iterrows():
85
- result += f"• {row['plot_name']}: IFT {row['predicted_ift']:.2f} (Risque: {row['risk_level']})\n"
86
-
87
- suitable_plots = analyzer.identify_suitable_plots_for_sensitive_crops(
88
- target_years=year_list, max_ift_threshold=max_ift
89
- )
90
-
91
- result += f"\n🌱 Parcelles adaptées aux cultures sensibles (IFT < {max_ift}):\n"
92
- for year, plots in suitable_plots.items():
93
- if plots:
94
- result += f"• {year}: {', '.join(plots)}\n"
95
- else:
96
- result += f"• {year}: Aucune parcelle adaptée\n"
97
-
98
- return result
99
- except Exception as e:
100
- return f"❌ Erreur lors de la prédiction: {str(e)}"
101
-
102
- @gr.mcp.tool()
103
- def analyze_crop_rotation() -> str:
104
- """Analyze the impact of crop rotations on weed pressure at Kerguéhennec station.
105
-
106
- Returns:
107
- Analysis of the best crop rotations with lowest average IFT herbicide usage.
108
- """
109
- try:
110
- rotation_impact = analyzer.analyze_crop_rotation_impact()
111
-
112
- if rotation_impact.empty:
113
- return "📊 Pas assez de données pour analyser les rotations"
114
-
115
- result = "🔄 IMPACT DES ROTATIONS CULTURALES\n\n🏆 Meilleures rotations (IFT moyen le plus bas):\n\n"
116
-
117
- best_rotations = rotation_impact.head(10)
118
- for i, (_, row) in enumerate(best_rotations.iterrows(), 1):
119
- result += f"{i}. **{row['rotation_type']}**\n"
120
- result += f" • IFT moyen: {row['mean_ift']:.2f}\n"
121
- result += f" • Écart-type: {row['std_ift']:.2f}\n"
122
- result += f" • Observations: {row['count']}\n\n"
123
-
124
- result += "💡 Les rotations avec les IFT les plus bas sont généralement plus durables."
125
- return result
126
- except Exception as e:
127
- return f"❌ Erreur lors de l'analyse des rotations: {str(e)}"
128
-
129
- @gr.mcp.tool()
130
- def get_dataset_summary() -> str:
131
- """Get a comprehensive summary of the agricultural dataset from Kerguéhennec experimental station.
132
-
133
- Returns:
134
- Complete summary with statistics, top crops, top plots and data coverage.
135
- """
136
- try:
137
- df = data_loader.load_all_files()
138
- if df.empty:
139
- return "❌ Aucune donnée disponible"
140
-
141
- summary = f"""📊 RÉSUMÉ DU DATASET AGRICOLE - STATION DE KERGUÉHENNEC
142
-
143
- 📈 Statistiques générales:
144
- • Total d'enregistrements: {len(df):,}
145
- • Parcelles uniques: {df['plot_name'].nunique()}
146
- • Types de cultures: {df['crop_type'].nunique()}
147
- • Années couvertes: {', '.join(map(str, sorted(df['year'].unique())))}
148
- • Applications herbicides: {len(df[df['is_herbicide'] == True]):,}
149
-
150
- 🌱 Top 5 des cultures:
151
- {df['crop_type'].value_counts().head(5).to_string()}
152
-
153
- 📍 Top 5 des parcelles:
154
- {df['plot_name'].value_counts().head(5).to_string()}
155
-
156
- 🏢 Source: Station Expérimentale de Kerguéhennec"""
157
- return summary
158
- except Exception as e:
159
- return f"❌ Erreur lors du chargement des données: {str(e)}"
160
-
- @gr.mcp.resource("agricultural://dataset/summary")
- def dataset_resource() -> str:
-     """Agricultural dataset summary resource for Kerguéhennec experimental station."""
-     return get_dataset_summary()

- @gr.mcp.prompt()
- def agricultural_analysis_prompt(analysis_type: str = "general", focus: str = "sustainability") -> str:
-     """Generate analysis prompts for agricultural data interpretation.
-
-     Args:
-         analysis_type: Type of analysis (general, weed_pressure, rotation, prediction)
-         focus: Focus area (sustainability, productivity, reduction)
-
-     Returns:
-         Customized prompt for agricultural analysis.
-     """
-     prompts = {
-         "general": "Analyze the agricultural data to provide insights on farming practices and sustainability",
-         "weed_pressure": "Focus on weed pressure analysis and herbicide usage patterns",
-         "rotation": "Examine crop rotation strategies and their impact on weed management",
-         "prediction": "Predict future agricultural trends and provide recommendations"
-     }
-
-     focus_additions = {
-         "sustainability": "with emphasis on sustainable and eco-friendly practices",
-         "productivity": "focusing on maximizing crop productivity and yield",
-         "reduction": "prioritizing herbicide reduction and organic alternatives"
-     }
-
-     base_prompt = prompts.get(analysis_type, prompts["general"])
-     focus_addition = focus_additions.get(focus, focus_additions["sustainability"])
-
-     return f"{base_prompt} {focus_addition}. Consider IFT values, crop rotations, and environmental impact in your analysis."

- # --------- Main Gradio interface ---------
- demo = create_gradio_app()

- # --------- Launch with the built-in MCP server ---------
  if __name__ == "__main__":
      demo.launch(
-         mcp_server=True,  # Enables the built-in MCP server
-         server_name="0.0.0.0",
-         server_port=PORT,
-         share=False
-     )
-
- # ========= MCP configuration for clients =========
- # The MCP endpoint will be available at: https://hackathoncra-mcp.hf.space/gradio_api/mcp/sse
- #
- # Configuration for MCP Inspector or other clients:
- # {
- #   "mcpServers": {
- #     "agricultural-analysis": {
- #       "url": "https://hackathoncra-mcp.hf.space/gradio_api/mcp/sse"
- #     }
- #   }
- # }
- #
- # For Claude Desktop (with mcp-remote):
- # {
- #   "mcpServers": {
- #     "agricultural-analysis": {
- #       "command": "npx",
- #       "args": [
- #         "mcp-remote",
- #         "https://hackathoncra-mcp.hf.space/gradio_api/mcp/sse"
- #       ]
- #     }
- #   }
- # }
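
The two client configurations in the comments above differ only in how the endpoint is reached: a direct `url` entry, or an `npx mcp-remote` command wrapping the same URL. A minimal sketch that builds both shapes as JSON, assuming the server name and endpoint from those comments; `build_mcp_config` is a hypothetical helper and is not part of this repository.

```python
import json

# Endpoint and server name are taken from the comments in the removed app.py.
MCP_ENDPOINT = "https://hackathoncra-mcp.hf.space/gradio_api/mcp/sse"
SERVER_NAME = "agricultural-analysis"


def build_mcp_config(name: str, url: str, use_mcp_remote: bool = False) -> dict:
    """Return an `mcpServers` mapping with a single server entry.

    use_mcp_remote=False gives the direct-URL form; True gives the
    `npx mcp-remote <url>` form shown for Claude Desktop.
    """
    if use_mcp_remote:
        entry = {"command": "npx", "args": ["mcp-remote", url]}
    else:
        entry = {"url": url}
    return {"mcpServers": {name: entry}}


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(SERVER_NAME, MCP_ENDPOINT), indent=2))
    print(json.dumps(build_mcp_config(SERVER_NAME, MCP_ENDPOINT, use_mcp_remote=True), indent=2))
```

Generating the file from one function keeps the URL in a single place, which helps if the Space is renamed.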
  """
+ Main application launcher for Hugging Face deployment
+ """

+ import os
+ from mcp_server import create_mcp_interface

+ # Hugging Face configuration
+ # HF_TOKEN is used as-is when it is already set; calling setdefault() with a
+ # None value would raise TypeError when the variable is missing.
+ os.environ.setdefault("DATASET_ID", "HackathonCRA/2024")

  if __name__ == "__main__":
+     demo = create_mcp_interface()
      demo.launch(
+         mcp_server=True,
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=True
+     )
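
The launcher takes its Hugging Face settings from environment variables, with `HackathonCRA/2024` as the default dataset. Because `os.environ` only accepts string values, a missing token is easier to represent as `None` in a plain dictionary than to write back into the environment. A minimal sketch under that assumption; `resolve_settings` is a hypothetical helper, not a function of this repository.

```python
import os
from typing import Mapping, Optional

DEFAULT_DATASET_ID = "HackathonCRA/2024"  # default used by the launcher


def resolve_settings(env: Mapping[str, str]) -> dict:
    """Read deployment settings from an environment-like mapping.

    An unset or empty HF_TOKEN becomes None; DATASET_ID falls back to the default.
    """
    token: Optional[str] = env.get("HF_TOKEN") or None
    dataset_id = env.get("DATASET_ID") or DEFAULT_DATASET_ID
    return {"hf_token": token, "dataset_id": dataset_id}


if __name__ == "__main__":
    settings = resolve_settings(os.environ)
    print("dataset:", settings["dataset_id"])
    print("token set:", settings["hf_token"] is not None)
```

Passing the mapping as an argument rather than reading `os.environ` inside the function makes the behaviour easy to check without touching the real environment.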
demo.py DELETED
@@ -1,218 +0,0 @@
- #!/usr/bin/env python3
- """
- Demo script for the Agricultural Analysis Tool
- Showcases the main features and functionality of the MCP server and analysis tools.
- """
-
- import warnings
- warnings.filterwarnings('ignore')
-
- from data_loader import AgriculturalDataLoader
- from analysis_tools import AgriculturalAnalyzer
- import pandas as pd
-
- def main():
-     """Run the demo of agricultural analysis features."""
-
-     print("🚜" + "="*60)
-     print(" AGRICULTURAL ANALYSIS TOOL - DEMO")
-     print(" Station Expérimentale de Kerguéhennec")
-     print("="*63)
-     print()
-
-     # Initialize components
-     print("🔧 Initializing components...")
-     data_loader = AgriculturalDataLoader()
-     analyzer = AgriculturalAnalyzer(data_loader)
-     print("✅ Components initialized successfully")
-     print()
-
-     # Load data
-     print("📊 Loading agricultural intervention data...")
-     df = data_loader.load_all_files()
-     print(f"✅ Loaded {len(df):,} intervention records")
-     print(f"📅 Data spans {df.year.nunique()} years: {sorted(df.year.unique())}")
-     print(f"🌱 Covers {df.crop_type.nunique()} different crop types")
-     print(f"📍 Across {df.plot_name.nunique()} different plots")
-     print(f"💊 Including {df.is_herbicide.sum():,} herbicide applications")
-     print()
-
-     # Show top crops and plots
-     print("🌾 TOP CROPS ANALYZED:")
-     top_crops = df.crop_type.value_counts().head(10)
-     for i, (crop, count) in enumerate(top_crops.items(), 1):
-         print(f" {i:2}. {crop:<30} ({count:3} interventions)")
-     print()
-
-     print("📍 TOP PLOTS ANALYZED:")
-     top_plots = df.plot_name.value_counts().head(10)
-     for i, (plot, count) in enumerate(top_plots.items(), 1):
-         print(f" {i:2}. {plot:<30} ({count:3} interventions)")
-     print()
-
-     # Analyze weed pressure
-     print("🌿 WEED PRESSURE ANALYSIS (IFT - Treatment Frequency Index)")
-     print("-" * 60)
-     trends = analyzer.analyze_weed_pressure_trends()
-     summary = trends['summary']
-
-     print(f"📈 Overall IFT Statistics:")
-     print(f" • Mean IFT: {summary['mean_ift']:.2f}")
-     print(f" • Standard deviation: {summary['std_ift']:.2f}")
-     print(f" • Minimum IFT: {summary['min_ift']:.2f}")
-     print(f" • Maximum IFT: {summary['max_ift']:.2f}")
-     print()
-
-     # Show IFT trends by year
-     if 'yearly_ift' in trends:
-         yearly_data = pd.DataFrame(trends['yearly_ift'])
-         print("📊 IFT Evolution by Year:")
-         for _, row in yearly_data.iterrows():
-             year = int(row['year'])
-             ift = row['ift_herbicide']
-             risk_indicator = "🟢" if ift < 1.0 else "🟡" if ift < 2.0 else "🔴"
-             print(f" {year}: {ift:.2f} {risk_indicator}")
-         print()
-
-     # Prediction demo
-     print("🔮 WEED PRESSURE PREDICTIONS (2025-2027)")
-     print("-" * 60)
-     try:
-         predictions = analyzer.predict_weed_pressure(target_years=[2025, 2026, 2027])
-         model_perf = predictions['model_performance']
-         print(f"🤖 Model Performance:")
-         print(f" • R² Score: {model_perf['r2']:.3f}")
-         print(f" • Mean Squared Error: {model_perf['mse']:.3f}")
-         print()
-
-         # Show predictions for each year
-         for year in [2025, 2026, 2027]:
-             if year in predictions['predictions']:
-                 year_pred = predictions['predictions'][year]
-                 print(f"📅 Predictions for {year}:")
-
-                 # Group by risk level
-                 risk_counts = year_pred['risk_level'].value_counts()
-                 for risk_level in ['low', 'medium', 'high']:
-                     count = risk_counts.get(risk_level, 0)
-                     emoji = {"low": "🟢", "medium": "🟡", "high": "🔴"}[risk_level]
-                     print(f" {emoji} {risk_level.capitalize()} risk: {count} plots")
-
-                 # Show a few examples
-                 low_risk = year_pred[year_pred['risk_level'] == 'low']
-                 if len(low_risk) > 0:
-                     print(f" 🌱 Best plots for sensitive crops:")
-                     for _, row in low_risk.head(5).iterrows():
-                         print(f" • {row['plot_name']}: IFT {row['predicted_ift']:.2f}")
-                 print()
-
-     except Exception as e:
-         print(f"❌ Prediction error: {e}")
-         print()
-
-     # Suitable plots for sensitive crops
-     print("🎯 PLOTS SUITABLE FOR SENSITIVE CROPS (peas, beans)")
-     print("-" * 60)
-     try:
-         suitable_plots = analyzer.identify_suitable_plots_for_sensitive_crops(
-             target_years=[2025, 2026, 2027],
-             max_ift_threshold=1.0
-         )
-
-         for year, plots in suitable_plots.items():
-             print(f"📅 {year}: {len(plots)} suitable plots")
-             if plots:
-                 for plot in plots[:5]:  # Show first 5
-                     print(f" ✅ {plot}")
-                 if len(plots) > 5:
-                     print(f" ... and {len(plots) - 5} more")
-             else:
-                 print(" ❌ No plots meet the criteria")
-         print()
-     except Exception as e:
-         print(f"❌ Analysis error: {e}")
-         print()
-
-     # Crop rotation analysis
-     print("🔄 CROP ROTATION IMPACT ANALYSIS")
-     print("-" * 60)
-     try:
-         rotation_impact = analyzer.analyze_crop_rotation_impact()
-         if not rotation_impact.empty:
-             print("🏆 Best rotations (lowest average IFT):")
-             best_rotations = rotation_impact.head(10)
-             for i, (_, row) in enumerate(best_rotations.iterrows(), 1):
-                 print(f" {i:2}. {row['rotation_type']:<40} IFT: {row['mean_ift']:.2f}")
-             print()
-
-             print("⚠️ Worst rotations (highest average IFT):")
-             worst_rotations = rotation_impact.tail(5)
-             for i, (_, row) in enumerate(worst_rotations.iterrows(), 1):
-                 print(f" {i:2}. {row['rotation_type']:<40} IFT: {row['mean_ift']:.2f}")
-         else:
-             print("❌ Insufficient data for rotation analysis")
-         print()
-     except Exception as e:
-         print(f"❌ Rotation analysis error: {e}")
-         print()
-
-     # Herbicide usage analysis
-     print("💊 HERBICIDE USAGE ANALYSIS")
-     print("-" * 60)
-     try:
-         herbicide_analysis = analyzer.analyze_herbicide_alternatives()
-         print("📈 Most frequently used herbicides:")
-         top_herbicides = herbicide_analysis.head(10)
-         for i, (_, row) in enumerate(top_herbicides.iterrows(), 1):
-             crop_info = f" ({row['crop_type']})" if pd.notna(row['crop_type']) else ""
-             print(f" {i:2}. {row['produit']:<30}{crop_info}")
-             print(f" Applications: {row['applications']:<3} | Total qty: {row['total_quantity']:.1f}")
-         print()
-     except Exception as e:
-         print(f"❌ Herbicide analysis error: {e}")
-         print()
-
-     # Summary and recommendations
-     print("📋 SUMMARY AND RECOMMENDATIONS")
-     print("="*60)
-     print("✅ ACHIEVEMENTS:")
-     print(" • Successfully loaded and analyzed 10 years of intervention data")
-     print(" • Calculated weed pressure trends using IFT methodology")
-     print(" • Developed predictive model for future weed pressure")
-     print(" • Identified suitable plots for sensitive crops")
-     print(" • Analyzed impact of crop rotations")
-     print()
-
-     print("🎯 KEY INSIGHTS:")
-     avg_ift = summary['mean_ift']
-     if avg_ift < 1.0:
-         print(" • Overall weed pressure is LOW - good for sensitive crops")
-     elif avg_ift < 2.0:
-         print(" • Overall weed pressure is MODERATE - requires monitoring")
-     else:
-         print(" • Overall weed pressure is HIGH - needs intervention")
-
-     print(f" • Current average IFT: {avg_ift:.2f}")
-     print(f" • {df.plot_name.nunique()} plots available for analysis")
-     print(f" • {df.crop_type.nunique()} different crop types in rotation")
-     print()
-
-     print("🚀 NEXT STEPS:")
-     print(" • Use the Gradio interface for interactive analysis")
-     print(" • Deploy on Hugging Face Spaces for broader access")
-     print(" • Configure MCP server for LLM integration")
-     print(" • Upload dataset to Hugging Face Hub")
-     print()
-
-     print("🌐 ACCESS THE TOOL:")
-     print(" • Gradio Interface: python gradio_app.py")
-     print(" • MCP Server: python mcp_server.py")
-     print(" • HF Deployment: python app.py")
-     print()
-
-     print("🚜" + "="*60)
-     print(" DEMO COMPLETED SUCCESSFULLY!")
-     print("="*63)
-
- if __name__ == "__main__":
-     main()
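
The demo applies the same cut-offs in several places: an IFT below 1.0 is treated as low weed pressure, below 2.0 as moderate, and anything else as high. A small sketch of that rule in isolation; `classify_ift` and `count_by_risk` are hypothetical names, the plot names and values are invented for illustration, and whether the analyzer's own `risk_level` column uses exactly these thresholds is an assumption.

```python
from collections import Counter
from typing import Dict, Mapping


def classify_ift(ift: float) -> str:
    """Map an IFT value to a risk label using the demo's cut-offs (1.0 and 2.0)."""
    if ift < 1.0:
        return "low"
    if ift < 2.0:
        return "medium"
    return "high"


def count_by_risk(predicted_ift: Mapping[str, float]) -> Dict[str, int]:
    """Count plots per risk label; labels with no plots are reported as 0."""
    counts = Counter(classify_ift(value) for value in predicted_ift.values())
    return {label: counts.get(label, 0) for label in ("low", "medium", "high")}


if __name__ == "__main__":
    # Hypothetical plots and IFT values, not taken from the dataset.
    sample = {"plot_a": 0.4, "plot_b": 1.3, "plot_c": 2.6, "plot_d": 0.9}
    for label, n in count_by_risk(sample).items():
        print(f"{label}: {n} plots")
```

Keeping the thresholds in one function avoids the three near-identical `if`/`elif` chains drifting apart.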
gradio_app.py DELETED
@@ -1,474 +0,0 @@
- """
- Gradio interface for the Agricultural MCP Server.
- Provides a web interface for interacting with agricultural data analysis tools.
- """
-
- import gradio as gr
- import json
- import pandas as pd
- import plotly.express as px
- import plotly.graph_objects as go
- from plotly.subplots import make_subplots
- import os
- from data_loader import AgriculturalDataLoader
- from analysis_tools import AgriculturalAnalyzer
-
-
- # Initialize components
- # Use Hugging Face dataset exclusively
- data_loader = AgriculturalDataLoader()
- print("🤗 Configured to use Hugging Face dataset exclusively")
-
- analyzer = AgriculturalAnalyzer(data_loader)
-
- # Global state for data
- def load_initial_data():
-     """Load and cache initial data."""
-     try:
-         df = data_loader.load_all_files()
-         return df
-     except Exception as e:
-         print(f"Error loading data: {e}")
-         return pd.DataFrame()
-
- def get_data_summary():
-     """Get summary of the agricultural data."""
-     try:
-         df = load_initial_data()
-         if df.empty:
-             return "Aucune donnée disponible"
-
-         summary = f"""
- ## Résumé des Données Agricoles - Station Expérimentale de Kerguéhennec
-
- 📊 **Statistiques Générales:**
- - **Total d'enregistrements:** {len(df):,}
- - **Parcelles uniques:** {df['plot_name'].nunique()}
- - **Types de cultures:** {df['crop_type'].nunique()}
- - **Années couvertes:** {', '.join(map(str, sorted(df['year'].unique())))}
- - **Applications herbicides:** {len(df[df['is_herbicide'] == True]):,}
-
- 🌱 **Cultures principales:**
- {df['crop_type'].value_counts().head(5).to_string()}
-
- 📍 **Parcelles principales:**
- {df['plot_name'].value_counts().head(5).to_string()}
- """
-         return summary
-     except Exception as e:
-         return f"Erreur lors du chargement des données: {str(e)}"
-
- def filter_and_analyze_data(years, plots, crops):
-     """Filter data and provide analysis."""
-     try:
-         df = load_initial_data()
-         if df.empty:
-             return "Aucune donnée disponible", None
-
-         # Convert inputs to lists if not None
-         year_list = [int(y) for y in years] if years else None
-         plot_list = plots if plots else None
-         crop_list = crops if crops else None
-
-         # Filter data
-         filtered_df = data_loader.filter_data(
-             years=year_list,
-             plots=plot_list,
-             crops=crop_list
-         )
-
-         if filtered_df.empty:
-             return "Aucune donnée trouvée avec ces filtres", None
-
-         # Generate analysis
-         analysis = f"""
- ## Analyse des Données Filtrées
-
- **Filtres appliqués:**
- - Années: {years if years else 'Toutes'}
- - Parcelles: {', '.join(plots) if plots else 'Toutes'}
- - Cultures: {', '.join(crops) if crops else 'Toutes'}
-
- **Résultats:**
- - Enregistrements filtrés: {len(filtered_df):,}
- - Applications herbicides: {len(filtered_df[filtered_df['is_herbicide'] == True]):,}
- - Parcelles concernées: {filtered_df['plot_name'].nunique()}
- - Cultures concernées: {filtered_df['crop_type'].nunique()}
-
- **Distribution par année:**
- {filtered_df['year'].value_counts().sort_index().to_string()}
- """
-
-         # Create visualization
-         yearly_dist = filtered_df['year'].value_counts().sort_index()
-         fig = px.bar(
-             x=yearly_dist.index,
-             y=yearly_dist.values,
-             title="Distribution des Interventions par Année",
-             labels={'x': 'Année', 'y': 'Nombre d\'Interventions'}
-         )
-
-         return analysis, fig
-
-     except Exception as e:
-         return f"Erreur lors de l'analyse: {str(e)}", None
-
- def analyze_weed_pressure(years, plots):
-     """Analyze weed pressure trends."""
-     try:
-         # Convert inputs
-         year_list = [int(y) for y in years] if years else None
-         plot_list = plots if plots else None
-
-         # Get analysis
-         trends = analyzer.analyze_weed_pressure_trends(years=year_list, plots=plot_list)
-
-         # Format results
-         summary_stats = trends['summary']
-         analysis_text = f"""
- ## Analyse de la Pression Adventices (IFT Herbicides)
-
- **Statistiques globales:**
- - IFT moyen: {summary_stats['mean_ift']:.2f}
- - Écart-type: {summary_stats['std_ift']:.2f}
- - IFT minimum: {summary_stats['min_ift']:.2f}
- - IFT maximum: {summary_stats['max_ift']:.2f}
- - Total applications: {summary_stats['total_applications']}
- - Parcelles analysées: {summary_stats['unique_plots']}
- - Cultures analysées: {summary_stats['unique_crops']}
-
- **Interprétation:**
- - IFT < 1.0: Pression faible (adapté aux cultures sensibles)
- - IFT 1.0-2.0: Pression modérée
- - IFT > 2.0: Pression élevée
- """
-
-         # Create visualization
-         fig = analyzer.create_weed_pressure_visualization(years=year_list, plots=plot_list)
-
-         return analysis_text, fig
-
-     except Exception as e:
-         return f"Erreur lors de l'analyse de pression: {str(e)}", None
-
- def predict_future_weed_pressure(target_years, max_ift):
-     """Predict weed pressure for future years."""
-     try:
-         # Convert target years
-         year_list = [int(y) for y in target_years] if target_years else [2025, 2026, 2027]
-
-         # Get predictions
-         predictions = analyzer.predict_weed_pressure(target_years=year_list)
-
-         # Format results
-         model_perf = predictions['model_performance']
-         results_text = f"""
- ## Prédiction de la Pression Adventices
-
- **Performance du modèle:**
- - R² Score: {model_perf['r2']:.3f}
- - Erreur quadratique moyenne: {model_perf['mse']:.3f}
-
- **Prédictions par année:**
- """
-
-         # Add predictions for each year
-         prediction_data = []
-         for year in year_list:
-             if year in predictions['predictions']:
-                 year_pred = predictions['predictions'][year]
-                 results_text += f"\n**{year}:**\n"
-
-                 for _, row in year_pred.iterrows():
-                     results_text += f"- {row['plot_name']}: IFT {row['predicted_ift']:.2f} (Risque: {row['risk_level']})\n"
-                     prediction_data.append({
-                         'Année': year,
-                         'Parcelle': row['plot_name'],
-                         'IFT_Prédit': row['predicted_ift'],
-                         'Niveau_Risque': row['risk_level']
-                     })
-
-         # Identify suitable plots
-         suitable_plots = analyzer.identify_suitable_plots_for_sensitive_crops(
-             target_years=year_list,
-             max_ift_threshold=max_ift
-         )
-
-         results_text += f"\n\n**Parcelles adaptées aux cultures sensibles (IFT < {max_ift}):**\n"
-         for year, plots in suitable_plots.items():
-             if plots:
-                 results_text += f"- {year}: {', '.join(plots)}\n"
-             else:
-                 results_text += f"- {year}: Aucune parcelle adaptée\n"
-
-         # Create visualization
-         if prediction_data:
-             pred_df = pd.DataFrame(prediction_data)
-             fig = px.scatter(
-                 pred_df,
-                 x='Année',
-                 y='IFT_Prédit',
-                 color='Niveau_Risque',
-                 size='IFT_Prédit',
-                 hover_data=['Parcelle'],
-                 title="Prédictions IFT par Parcelle et Année",
-                 color_discrete_map={'low': 'green', 'medium': 'orange', 'high': 'red'}
-             )
-             fig.add_hline(y=max_ift, line_dash="dash", line_color="red",
-                           annotation_text=f"Seuil cultures sensibles ({max_ift})")
-
-             return results_text, fig
-         else:
-             return results_text, None
-
-     except Exception as e:
-         return f"Erreur lors de la prédiction: {str(e)}", None
-
- def analyze_crop_rotation():
-     """Analyze crop rotation impact."""
-     try:
-         rotation_impact = analyzer.analyze_crop_rotation_impact()
-
-         if rotation_impact.empty:
-             return "Pas assez de données pour analyser les rotations", None
-
-         analysis_text = f"""
- ## Impact des Rotations sur la Pression Adventices
-
- **Rotations les plus favorables (IFT moyen le plus bas):**
- """
-
-         # Show top 10 best rotations
-         best_rotations = rotation_impact.head(10)
-         for _, row in best_rotations.iterrows():
-             analysis_text += f"\n- **{row['rotation_type']}**"
-             analysis_text += f"\n - IFT moyen: {row['mean_ift']:.2f}"
-             analysis_text += f"\n - Écart-type: {row['std_ift']:.2f}"
-             analysis_text += f"\n - Observations: {row['count']}\n"
-
-         # Create visualization
-         top_20 = rotation_impact.head(20)
-         fig = px.bar(
-             top_20,
-             x='mean_ift',
-             y='rotation_type',
-             orientation='h',
-             title="Impact des Rotations sur l'IFT Herbicide (Top 20)",
-             labels={'mean_ift': 'IFT Moyen', 'rotation_type': 'Type de Rotation'},
-             color='mean_ift',
-             color_continuous_scale='RdYlGn_r'
-         )
-         fig.update_layout(height=800)
-
-         return analysis_text, fig
-
-     except Exception as e:
-         return f"Erreur lors de l'analyse des rotations: {str(e)}", None
-
- def analyze_herbicide_usage():
-     """Analyze herbicide usage patterns."""
-     try:
-         herbicide_analysis = analyzer.analyze_herbicide_alternatives()
-
-         analysis_text = f"""
- ## Analyse des Herbicides Utilisés
-
- **Herbicides les plus utilisés:**
- """
-
-         top_herbicides = herbicide_analysis.head(15)
-         for _, row in top_herbicides.iterrows():
-             analysis_text += f"\n- **{row['produit']}** ({row['crop_type']})"
-             analysis_text += f"\n - Applications: {row['applications']}"
-             analysis_text += f"\n - Quantité totale: {row['total_quantity']:.1f}"
-             analysis_text += f"\n - Quantité moyenne: {row['avg_quantity']:.1f}"
-             if not pd.isna(row['amm_code']):
-                 analysis_text += f"\n - Code AMM: {row['amm_code']}"
-             analysis_text += "\n"
-
-         # Create visualization
-         fig = px.bar(
-             top_herbicides.head(10),
-             x='applications',
-             y='produit',
-             orientation='h',
-             title="Herbicides les Plus Utilisés (Nombre d'Applications)",
-             labels={'applications': 'Nombre d\'Applications', 'produit': 'Produit'},
-             color='applications'
-         )
-         fig.update_layout(height=600)
-
-         return analysis_text, fig
-
-     except Exception as e:
-         return f"Erreur lors de l'analyse des herbicides: {str(e)}", None
-
- # Create Gradio interface
- def create_gradio_app():
-     """Create the Gradio application."""
-
-     # Load data for dropdowns
-     try:
-         df = load_initial_data()
-         available_years = sorted(df['year'].unique()) if not df.empty else []
-         available_plots = sorted(df['plot_name'].unique()) if not df.empty else []
-         available_crops = sorted(df['crop_type'].unique()) if not df.empty else []
-     except:
-         available_years = []
-         available_plots = []
-         available_crops = []
-
-     with gr.Blocks(title="🚜 Analyse Agricole - Station de Kerguéhennec", theme=gr.themes.Soft()) as app:
-         gr.Markdown("""
- # 🚜 Analyse des Données Agricoles
- ## Station Expérimentale de Kerguéhennec
-
- ### Outil d'aide à la décision pour la réduction des herbicides et l'identification des parcelles adaptées aux cultures sensibles
- """)
-
-         with gr.Tabs():
-             # Tab 1: Data Overview
-             with gr.Tab("📊 Aperçu des Données"):
-                 gr.Markdown("## Résumé des données disponibles")
-                 summary_output = gr.Markdown(value=get_data_summary())
-                 refresh_btn = gr.Button("🔄 Actualiser", variant="secondary")
-                 refresh_btn.click(get_data_summary, outputs=summary_output)
-
-             # Tab 2: Data Filtering
-             with gr.Tab("🔍 Filtrage et Exploration"):
-                 gr.Markdown("## Filtrer et explorer les données")
-
-                 with gr.Row():
-                     with gr.Column():
-                         years_filter = gr.CheckboxGroup(
-                             choices=[str(y) for y in available_years],
-                             label="Années",
-                             value=[str(y) for y in available_years[-3:]] if available_years else []
-                         )
-                         plots_filter = gr.CheckboxGroup(
-                             choices=available_plots,
-                             label="Parcelles",
-                             value=available_plots[:5] if available_plots else []
-                         )
-                         crops_filter = gr.CheckboxGroup(
-                             choices=available_crops,
-                             label="Cultures",
-                             value=available_crops[:5] if available_crops else []
-                         )
-
-                         analyze_btn = gr.Button("📈 Analyser", variant="primary")
-
-                     with gr.Column():
-                         filter_results = gr.Markdown()
-                         filter_plot = gr.Plot()
-
-                 analyze_btn.click(
-                     filter_and_analyze_data,
-                     inputs=[years_filter, plots_filter, crops_filter],
-                     outputs=[filter_results, filter_plot]
-                 )
-
-             # Tab 3: Weed Pressure Analysis
-             with gr.Tab("🌿 Pression Adventices"):
-                 gr.Markdown("## Analyse de la pression adventices (IFT Herbicides)")
-
-                 with gr.Row():
-                     with gr.Column():
-                         years_pressure = gr.CheckboxGroup(
-                             choices=[str(y) for y in available_years],
-                             label="Années à analyser",
-                             value=[str(y) for y in available_years] if available_years else []
-                         )
-                         plots_pressure = gr.CheckboxGroup(
-                             choices=available_plots,
-                             label="Parcelles à analyser",
-                             value=available_plots if len(available_plots) <= 10 else available_plots[:10]
-                         )
-
-                         pressure_btn = gr.Button("🔬 Analyser la Pression", variant="primary")
-
-                     with gr.Column():
-                         pressure_results = gr.Markdown()
-                         pressure_plot = gr.Plot()
-
-                 pressure_btn.click(
-                     analyze_weed_pressure,
-                     inputs=[years_pressure, plots_pressure],
-                     outputs=[pressure_results, pressure_plot]
-                 )
-
-             # Tab 4: Predictions
-             with gr.Tab("🔮 Prédictions"):
-                 gr.Markdown("## Prédiction de la pression adventices")
-
-                 with gr.Row():
-                     with gr.Column():
-                         target_years = gr.CheckboxGroup(
-                             choices=["2025", "2026", "2027"],
-                             label="Années à prédire",
-                             value=["2025", "2026", "2027"]
-                         )
-                         max_ift = gr.Slider(
-                             minimum=0.5,
-                             maximum=3.0,
-                             value=1.0,
-                             step=0.1,
-                             label="Seuil IFT max pour cultures sensibles"
-                         )
-
-                         predict_btn = gr.Button("🎯 Prédire", variant="primary")
-
-                     with gr.Column():
-                         prediction_results = gr.Markdown()
-                         prediction_plot = gr.Plot()
-
-                 predict_btn.click(
-                     predict_future_weed_pressure,
-                     inputs=[target_years, max_ift],
-                     outputs=[prediction_results, prediction_plot]
-                 )
-
-             # Tab 5: Crop Rotation
-             with gr.Tab("🔄 Rotations"):
-                 gr.Markdown("## Impact des rotations culturales")
-
-                 rotation_btn = gr.Button("📊 Analyser les Rotations", variant="primary")
-                 rotation_results = gr.Markdown()
-                 rotation_plot = gr.Plot()
-
-                 rotation_btn.click(
-                     analyze_crop_rotation,
-                     outputs=[rotation_results, rotation_plot]
-                 )
-
-             # Tab 6: Herbicide Analysis
-             with gr.Tab("💊 Herbicides"):
-                 gr.Markdown("## Analyse des herbicides utilisés")
-
-                 herbicide_btn = gr.Button("🧪 Analyser les Herbicides", variant="primary")
-                 herbicide_results = gr.Markdown()
-                 herbicide_plot = gr.Plot()
-
-                 herbicide_btn.click(
-                     analyze_herbicide_usage,
-                     outputs=[herbicide_results, herbicide_plot]
-                 )
-
-         gr.Markdown("""
- ---
- **Note:** Cet outil utilise les données historiques d'interventions de la Station Expérimentale de Kerguéhennec
- pour analyser la pression adventices et identifier les parcelles les plus adaptées aux cultures sensibles
- comme le pois et le haricot.
- """)
-
-     return app
-
- # Launch the app
- if __name__ == "__main__":
-     app = create_gradio_app()
-     app.launch(
-         server_name="0.0.0.0",
-         server_port=7860,
-         share=True,
-         debug=True
-     )
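
The callbacks in this file receive `CheckboxGroup` selections as lists of strings and convert them before filtering: years become integers, and an empty selection becomes `None`, which the callers treat as "no filter". A minimal sketch of that conversion on its own; `normalize_filters` is a hypothetical helper, and the sample values are invented for illustration.

```python
from typing import Dict, List, Optional, Sequence


def normalize_filters(
    years: Optional[Sequence[str]],
    plots: Optional[Sequence[str]],
    crops: Optional[Sequence[str]],
) -> Dict[str, Optional[List]]:
    """Convert UI selections into filter arguments.

    Empty or missing selections map to None (no filter); year strings
    such as "2023" are converted to integers.
    """
    return {
        "years": [int(y) for y in years] if years else None,
        "plots": list(plots) if plots else None,
        "crops": list(crops) if crops else None,
    }


if __name__ == "__main__":
    # Hypothetical selections, as a CheckboxGroup would deliver them.
    print(normalize_filters(["2022", "2023"], [], ["blé tendre"]))
```

The resulting dictionary could be passed as keyword arguments to a filter function that accepts `years`, `plots` and `crops`, as `data_loader.filter_data` does in the code above.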
hf_integration.py DELETED
@@ -1,313 +0,0 @@
- """
- Hugging Face integration for dataset management and model deployment.
- """
-
- import os
- import pandas as pd
- from datasets import Dataset, DatasetDict
- from huggingface_hub import HfApi, create_repo, upload_file
- from pathlib import Path
- from typing import Optional, Dict, Any
- import json
-
- class HuggingFaceIntegration:
-     """Handles Hugging Face dataset and model operations."""
-
-     def __init__(self, token: Optional[str] = None, dataset_id: str = "HackathonCRA/2024"):
-         self.token = token or os.environ.get("HF_TOKEN")
-         self.dataset_id = dataset_id
-         self.api = HfApi(token=self.token) if self.token else None
-
-     def prepare_dataset_from_local_files(self, data_path: str) -> Dataset:
-         """Prepare dataset from local CSV/Excel files."""
-         from data_loader import AgriculturalDataLoader
-
-         # Load and combine all data files
-         loader = AgriculturalDataLoader(data_path=data_path)
-         df = loader.load_all_files()
-
-         # Convert to Hugging Face Dataset
-         dataset = Dataset.from_pandas(df)
-
-         return dataset
-
-     def upload_dataset(self, data_path: str, private: bool = False) -> str:
-         """Upload agricultural data to Hugging Face Hub."""
-         if not self.token:
-             raise ValueError("HF_TOKEN required for uploading")
-
-         # Prepare dataset
-         dataset = self.prepare_dataset_from_local_files(data_path)
-
-         # Create repository if it doesn't exist
-         try:
-             create_repo(
-                 repo_id=self.dataset_id,
-                 token=self.token,
-                 repo_type="dataset",
-                 private=private,
-                 exist_ok=True
-             )
-         except Exception as e:
-             print(f"Repository might already exist: {e}")
-
-         # Upload dataset
-         dataset.push_to_hub(
-             repo_id=self.dataset_id,
-             token=self.token,
-             private=private
-         )
-
-         return f"Dataset uploaded to https://huggingface.co/datasets/{self.dataset_id}"
-
-     def create_dataset_card(self) -> str:
-         """Create a dataset card for the agricultural data."""
-         card_content = """
- ---
- license: cc-by-4.0
- task_categories:
- - tabular-regression
- - time-series-forecasting
- language:
- - fr
- tags:
- - agriculture
- - herbicides
- - weed-pressure
- - crop-rotation
- - france
- - bretagne
- size_categories:
- - 1K<n<10K
- ---
-
- # 🚜 Station Expérimentale de Kerguéhennec - Agricultural Interventions Dataset
-
- ## Dataset Description
-
- This dataset contains agricultural intervention records from the Station Expérimentale de Kerguéhennec in Brittany, France, spanning from 2014 to 2024. The data includes detailed information about agricultural practices, crop rotations, herbicide treatments, and field management operations.
-
- ## Dataset Summary
-
- - **Source**: Station Expérimentale de Kerguéhennec
- - **Time Period**: 2014-2024
- - **Location**: Brittany, France
- - **Records**: ~10,000+ intervention records
- - **Format**: CSV/Excel exports from farm management system
-
- ## Use Cases
-
- This dataset is particularly valuable for:
-
- 1. **Weed Pressure Analysis**: Calculate and predict Treatment Frequency Index (IFT) for herbicides
- 2. **Crop Rotation Optimization**: Analyze the impact of different crop sequences on pest pressure
- 3. **Sustainable Agriculture**: Support reduction of herbicide use while maintaining productivity
- 4. **Precision Agriculture**: Identify suitable plots for sensitive crops (peas, beans)
- 5. **Agricultural Research**: Study relationships between practices and outcomes
-
- ## Data Fields
-
- ### Core Fields
- - `millesime`: Year of intervention
- - `nomparc`: Plot/field name
- - `surfparc`: Plot surface area (hectares)
- - `libelleusag`: Crop type/usage
- - `datedebut`/`datefin`: Intervention start/end dates
- - `libevenem`: Intervention type
- - `familleprod`: Product family (herbicides, fungicides, etc.)
- - `produit`: Specific product used
- - `quantitetot`: Total quantity applied
- - `unite`: Unit of measurement
-
- ### Derived Fields
- - `year`: Intervention year
- - `crop_type`: Standardized crop classification
- - `is_herbicide`: Boolean flag for herbicide treatments
- - `ift_herbicide`: Treatment Frequency Index calculation
-
- ## Data Quality
-
- - All personal identifying information has been removed
- - Geographic coordinates are generalized to protect farm location
- - Product codes (AMM) are preserved for regulatory analysis
- - Missing values are clearly marked and documented
-
- ## Methodology
-
- ### IFT Calculation
- The Treatment Frequency Index (IFT) is calculated as:
- ```
- IFT = Number of applications / Plot surface area
- ```
-
- This metric is crucial for:
- - Regulatory compliance monitoring
- - Sustainable practice assessment
- - Risk evaluation for sensitive crops
-
- ## Applications
-
- ### 1. Weed Pressure Prediction
- Use machine learning models to predict future IFT values based on:
- - Historical treatment patterns
- - Crop rotation sequences
154
- - Environmental factors
155
- - Plot characteristics
156
-
157
- ### 2. Sustainable Plot Selection
158
- Identify plots suitable for sensitive crops (peas, beans) by:
159
- - Analyzing historical IFT trends
160
- - Evaluating rotation impacts
161
- - Assessing risk levels
162
-
163
- ### 3. Alternative Strategy Development
164
- Support herbicide reduction strategies through:
165
- - Product usage pattern analysis
166
- - Rotation optimization recommendations
167
- - Risk assessment frameworks
168
-
169
- ## Citation
170
-
171
- If you use this dataset in your research, please cite:
172
-
173
- ```
174
- @dataset{hackathon_cra_2024,
175
- title={Station Expérimentale de Kerguéhennec Agricultural Interventions Dataset},
176
- author={Hackathon CRA Team},
177
- year={2024},
178
- publisher={Hugging Face},
179
- url={https://huggingface.co/datasets/HackathonCRA/2024}
180
- }
181
- ```
182
-
183
- ## License
184
-
185
- This dataset is released under CC-BY-4.0 license, allowing for both commercial and research use with proper attribution.
186
-
187
- ## Contact
188
-
189
- For questions about this dataset or collaboration opportunities, please contact the research team through the Hugging Face dataset page.
190
-
191
- ---
192
-
193
- **Keywords**: agriculture, herbicides, crop rotation, sustainable farming, France, Brittany, IFT, weed management, precision agriculture
194
- """
195
- return card_content
196
-
197
- def upload_app_space(self, local_app_path: str, space_name: str = "agricultural-analysis") -> str:
198
- """Upload the Gradio app as a Hugging Face Space."""
199
- if not self.token:
200
- raise ValueError("HF_TOKEN required for uploading")
201
-
202
- repo_id = f"{self.api.whoami()['name']}/{space_name}"
203
-
204
- # Create Space repository
205
- try:
206
- create_repo(
207
- repo_id=repo_id,
208
- token=self.token,
209
- repo_type="space",
210
- space_sdk="gradio",
211
- private=False,
212
- exist_ok=True
213
- )
214
- except Exception as e:
215
- print(f"Space creation failed: {e}")
216
-
217
- # Upload files
218
- app_files = [
219
- "app.py",
220
- "requirements.txt",
221
- "gradio_app.py",
222
- "data_loader.py",
223
- "analysis_tools.py",
224
- "mcp_server.py",
225
- "README.md"
226
- ]
227
-
228
- for file_name in app_files:
229
- file_path = Path(local_app_path) / file_name
230
- if file_path.exists():
231
- upload_file(
232
- path_or_fileobj=str(file_path),
233
- path_in_repo=file_name,
234
- repo_id=repo_id,
235
- repo_type="space",
236
- token=self.token
237
- )
238
- print(f"Uploaded {file_name}")
239
-
240
- return f"Space created at https://huggingface.co/spaces/{repo_id}"
241
-
242
- def create_space_readme(self) -> str:
243
- """Create README for Hugging Face Space."""
244
- readme_content = """
245
- ---
246
- title: Agricultural Analysis - Kerguéhennec
247
- emoji: 🚜
248
- colorFrom: green
249
- colorTo: blue
250
- sdk: gradio
251
- sdk_version: 4.0.0
252
- app_file: app.py
253
- pinned: false
254
- license: cc-by-4.0
255
- ---
256
-
257
- # 🚜 Agricultural Analysis - Station de Kerguéhennec
258
-
259
- Agricultural data analysis tool for optimizing crop-protection practices and identifying plots suited to sensitive crops.
260
-
261
- ## Features
262
-
263
- - 📊 Analysis of agricultural intervention data
264
- - 🌿 Weed pressure assessment (IFT)
265
- - 🔮 Predictions for the next 3 years
266
- - 🔄 Analysis of crop rotation impact
267
- - 💊 Review of the herbicides used
268
- - 🎯 Identification of plots for sensitive crops
269
-
270
- ## Usage
271
-
272
- 1. Select the tab that matches your analysis
273
- 2. Configure the filters as needed
274
- 3. Run the analysis to get the results
275
- 4. Explore the interactive visualizations
276
-
277
- ## Data
278
-
279
- Based on data from the Station Expérimentale de Kerguéhennec (2014-2024).
280
- """
281
- return readme_content
282
-
283
- def setup_environment_variables(self) -> Dict[str, str]:
284
- """Setup environment variables for Hugging Face deployment."""
285
- env_vars = {
286
- "HF_TOKEN": self.token or "your_hf_token_here",
287
- "DATASET_ID": self.dataset_id,
288
- "GRADIO_SERVER_NAME": "0.0.0.0",
289
- "GRADIO_SERVER_PORT": "7860"
290
- }
291
-
292
- return env_vars
293
-
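Review note on `setup_environment_variables`: the returned mapping carries the raw `HF_TOKEN`, and the usage block further down prints the whole mapping. A minimal sketch of masking secrets before logging follows. The helper names `build_env_vars` and `redact` are hypothetical and not part of the removed file; the keys and default values are copied from the diff.

```python
import os
from typing import Dict, Optional, Sequence


def build_env_vars(token: Optional[str], dataset_id: str) -> Dict[str, str]:
    """Build the same mapping shape as the removed method (hypothetical helper)."""
    return {
        "HF_TOKEN": token or "your_hf_token_here",
        "DATASET_ID": dataset_id,
        "GRADIO_SERVER_NAME": "0.0.0.0",
        "GRADIO_SERVER_PORT": "7860",
    }


def redact(env_vars: Dict[str, str],
           secret_keys: Sequence[str] = ("HF_TOKEN",)) -> Dict[str, str]:
    """Return a copy that is safe to print: values of secret keys are masked."""
    return {key: ("***" if key in secret_keys else value)
            for key, value in env_vars.items()}


if __name__ == "__main__":
    env = build_env_vars(os.environ.get("HF_TOKEN"), "HackathonCRA/2024")
    # Only the redacted copy is printed; the original mapping is left untouched.
    print("Environment variables:", redact(env))
```

The original mapping stays usable for deployment, while anything written to logs goes through `redact` first.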
294
- # Usage example
295
- if __name__ == "__main__":
296
- # Initialize HF integration
297
- hf = HuggingFaceIntegration()
298
-
299
- # Upload dataset (requires HF_TOKEN)
300
- if hf.token:
301
- try:
302
- result = hf.upload_dataset("/Users/tracyandre/Downloads/OneDrive_1_9-17-2025")
303
- print(result)
304
- except Exception as e:
305
- print(f"Dataset upload failed: {e}")
306
-
307
- # Create dataset card
308
- card = hf.create_dataset_card()
309
- print("Dataset card created")
310
-
311
- # Show environment setup
312
- env_vars = hf.setup_environment_variables()
313
- print("Environment variables:", env_vars)
hf_usage_example.py DELETED
@@ -1,214 +0,0 @@
1
- #!/usr/bin/env python3
2
- """
3
- Example usage of the agricultural data loader with Hugging Face integration.
4
- Shows different ways to load and use the data.
5
- """
6
-
7
- import os
8
- import warnings
9
- warnings.filterwarnings('ignore')
10
-
11
- from data_loader import AgriculturalDataLoader
12
- from analysis_tools import AgriculturalAnalyzer
13
-
14
- def example_local_usage():
15
- """Example: Load from local files."""
16
- print("📁 EXAMPLE 1: Loading from local files")
17
- print("-" * 40)
18
-
19
- # Create loader for local files
20
- loader = AgriculturalDataLoader.create_local_loader(
21
- data_path="/Users/tracyandre/Downloads/OneDrive_1_9-17-2025"
22
- )
23
-
24
- # Load and analyze data
25
- df = loader.load_all_files()
26
- print(f"✅ Loaded {len(df):,} records from local files")
27
-
28
- # Basic analysis
29
- analyzer = AgriculturalAnalyzer(loader)
30
- trends = analyzer.analyze_weed_pressure_trends()
31
- print(f"📊 Average IFT: {trends['summary']['mean_ift']:.2f}")
32
-
33
- return df
34
-
35
- def example_hf_usage():
36
- """Example: Load from Hugging Face (if available)."""
37
- print("\n🤗 EXAMPLE 2: Loading from Hugging Face")
38
- print("-" * 40)
39
-
40
- # Check if HF token is available
41
- if not os.environ.get("HF_TOKEN"):
42
- print("⚠️ No HF_TOKEN found - skipping HF example")
43
- print("💡 Set HF_TOKEN environment variable to use this feature")
44
- return None
45
-
46
- try:
47
- # Create loader for Hugging Face
48
- loader = AgriculturalDataLoader.create_hf_loader(
49
- dataset_id="HackathonCRA/2024"
50
- )
51
-
52
- # Load and analyze data
53
- df = loader.load_all_files()
54
- print(f"✅ Loaded {len(df):,} records from Hugging Face")
55
-
56
- # Basic analysis
57
- analyzer = AgriculturalAnalyzer(loader)
58
- trends = analyzer.analyze_weed_pressure_trends()
59
- print(f"📊 Average IFT: {trends['summary']['mean_ift']:.2f}")
60
-
61
- return df
62
-
63
- except Exception as e:
64
- print(f"❌ Failed to load from Hugging Face: {e}")
65
- return None
66
-
67
- def example_automatic_fallback():
68
- """Example: Automatic fallback from HF to local."""
69
- print("\n🔄 EXAMPLE 3: Automatic fallback")
70
- print("-" * 40)
71
-
72
- # Create loader with HF preferred but local fallback
73
- loader = AgriculturalDataLoader(
74
- data_path="/Users/tracyandre/Downloads/OneDrive_1_9-17-2025",
75
- dataset_id="HackathonCRA/2024",
76
- use_hf=True # Try HF first
77
- )
78
-
79
- # This will try HF first, then fallback to local if needed
80
- df = loader.load_all_files()
81
- print(f"✅ Loaded {len(df):,} records (with automatic source selection)")
82
-
83
- return df
84
-
85
- def example_dynamic_switching():
86
- """Example: Dynamic switching between sources."""
87
- print("\n🔀 EXAMPLE 4: Dynamic source switching")
88
- print("-" * 40)
89
-
90
- # Create loader
91
- loader = AgriculturalDataLoader(
92
- data_path="/Users/tracyandre/Downloads/OneDrive_1_9-17-2025",
93
- dataset_id="HackathonCRA/2024"
94
- )
95
-
96
- # Load from local first
97
- loader.set_data_source(use_hf=False)
98
- df_local = loader.load_all_files()
99
- print(f"📁 Local source: {len(df_local):,} records")
100
-
101
- # Switch to HF (if available)
102
- if os.environ.get("HF_TOKEN"):
103
- try:
104
- loader.set_data_source(use_hf=True)
105
- df_hf = loader.load_all_files()
106
- print(f"🤗 HF source: {len(df_hf):,} records")
107
-
108
- # Compare
109
- if len(df_local) == len(df_hf):
110
- print("✅ Data consistency verified")
111
- else:
112
- print(f"⚠️ Data mismatch: {abs(len(df_local) - len(df_hf))} record difference")
113
-
114
- except Exception as e:
115
- print(f"🤗 HF switching failed: {e}")
116
- else:
117
- print("⚠️ No HF_TOKEN - skipping HF switch test")
118
-
119
- return df_local
120
-
121
- def example_production_deployment():
122
- """Example: Production deployment configuration."""
123
- print("\n🚀 EXAMPLE 5: Production deployment setup")
124
- print("-" * 40)
125
-
126
- # Production configuration
127
- # This is how you'd set it up for Hugging Face Spaces deployment
128
-
129
- print("💡 For Hugging Face Spaces deployment:")
130
- print("1. Set HF_TOKEN as a Space secret")
131
- print("2. Configure the loader as follows:")
132
- print()
133
-
134
- config_code = '''
135
- # In your app.py or gradio_app.py
136
- import os
137
- from data_loader import AgriculturalDataLoader
138
-
139
- # Production configuration
140
- hf_token = os.environ.get("HF_TOKEN")
141
- dataset_id = "HackathonCRA/2024"
142
-
143
- if hf_token:
144
- # Use HF dataset in production
145
- data_loader = AgriculturalDataLoader.create_hf_loader(
146
- dataset_id=dataset_id,
147
- hf_token=hf_token
148
- )
149
- print("🤗 Using Hugging Face dataset")
150
- else:
151
- # Fallback for local development
152
- data_loader = AgriculturalDataLoader.create_local_loader(
153
- data_path="./data" # Local data directory
154
- )
155
- print("📁 Using local files")
156
- '''
157
-
158
- print(config_code)
159
-
160
- # Example of actual production setup
161
- try:
162
- hf_token = os.environ.get("HF_TOKEN")
163
- if hf_token:
164
- loader = AgriculturalDataLoader.create_hf_loader("HackathonCRA/2024", hf_token)
165
- print("✅ Production setup: HF dataset configured")
166
- else:
167
- loader = AgriculturalDataLoader.create_local_loader("/Users/tracyandre/Downloads/OneDrive_1_9-17-2025")
168
- print("✅ Development setup: Local files configured")
169
-
170
- df = loader.load_all_files()
171
- print(f"📊 Ready for production: {len(df):,} records available")
172
-
173
- except Exception as e:
174
- print(f"❌ Production setup failed: {e}")
175
-
176
- def main():
177
- """Run all examples."""
178
- print("🚜 AGRICULTURAL DATA LOADER - USAGE EXAMPLES")
179
- print("=" * 60)
180
-
181
- # Run examples
182
- example_local_usage()
183
- example_hf_usage()
184
- example_automatic_fallback()
185
- example_dynamic_switching()
186
- example_production_deployment()
187
-
188
- print("\n" + "=" * 60)
189
- print("🎯 SUMMARY")
190
- print("=" * 60)
191
- print("""
192
- The AgriculturalDataLoader now supports:
193
-
194
- ✅ Local file loading (CSV/Excel)
195
- ✅ Hugging Face dataset loading
196
- ✅ Automatic fallback (HF → Local)
197
- ✅ Dynamic source switching
198
- ✅ Production deployment ready
199
-
200
- Key benefits:
201
- 🔄 Seamless data source switching
202
- 🚀 Cloud deployment ready
203
- 📊 Same analysis tools work with both sources
204
- 🔧 Easy configuration management
205
- """)
206
-
207
- print("🛠️ Next steps:")
208
- print("1. Upload your dataset to Hugging Face Hub")
209
- print("2. Set HF_TOKEN environment variable")
210
- print("3. Deploy to Hugging Face Spaces")
211
- print("4. Enjoy cloud-based agricultural analysis!")
212
-
213
- if __name__ == "__main__":
214
- main()
launch.py DELETED
@@ -1,170 +0,0 @@
1
- #!/usr/bin/env python3
2
- """
3
- Launch script for the Agricultural Analysis Tool
4
- Simple launcher with menu options for different modes.
5
- """
6
-
7
- import sys
8
- import os
9
- import subprocess
10
- import warnings
11
- warnings.filterwarnings('ignore')
12
-
13
- def print_banner():
14
- """Print the application banner."""
15
- print("🚜" + "="*70)
16
- print(" AGRICULTURAL ANALYSIS TOOL - STATION DE KERGUÉHENNEC")
17
- print(" Hackathon CRA - Herbicide Reduction")
18
- print("="*73)
19
- print()
20
-
21
- def check_dependencies():
22
- """Check if all required dependencies are installed."""
23
- print("🔧 Checking dependencies...")
24
- try:
25
- import pandas, numpy, matplotlib, seaborn, sklearn, gradio, plotly
26
- from data_loader import AgriculturalDataLoader
27
- from analysis_tools import AgriculturalAnalyzer
28
- print("✅ All dependencies are installed")
29
- return True
30
- except ImportError as e:
31
- print(f"❌ Missing dependency: {e}")
32
- print("Please run: pip install -r requirements.txt")
33
- return False
34
-
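Review note on `check_dependencies`: importing every package just to confirm it is installed is slow, and it reports only the first missing one. A lighter alternative, sketched here under the assumption that only top-level module names are checked, is to look modules up with `importlib.util.find_spec`. The function name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names taken from the removed launcher; "sklearn" is the import
    # name of the scikit-learn distribution.
    required = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn", "gradio", "plotly"]
    missing = missing_modules(required)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
        print("Please run: pip install -r requirements.txt")
    else:
        print("All dependencies are installed")
```

This lists every missing package in one pass, so a single `pip install` run can fix them all.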
35
- def test_data_loading():
36
- """Test if data can be loaded successfully."""
37
- print("📊 Testing data loading...")
38
- try:
39
- from data_loader import AgriculturalDataLoader
40
- loader = AgriculturalDataLoader()
41
- df = loader.load_all_files()
42
- print(f"✅ Successfully loaded {len(df):,} records")
43
- return True
44
- except Exception as e:
45
- print(f"❌ Data loading failed: {e}")
46
- return False
47
-
48
- def launch_gradio():
49
- """Launch the Gradio interface."""
50
- print("🚀 Launching Gradio interface...")
51
- print("📱 The app will open in your web browser")
52
- print("🌐 Access at: http://localhost:7860")
53
- print("⏹️ Press Ctrl+C to stop the server")
54
- print()
55
-
56
- try:
57
- from gradio_app import create_gradio_app
58
- app = create_gradio_app()
59
- app.launch(
60
- server_name="0.0.0.0",
61
- server_port=7860,
62
- share=False,
63
- debug=False,
64
- quiet=False
65
- )
66
- except KeyboardInterrupt:
67
- print("\n🛑 Server stopped by user")
68
- except Exception as e:
69
- print(f"❌ Failed to launch Gradio: {e}")
70
-
71
- def launch_mcp_server():
72
- """Launch the MCP server."""
73
- print("🤖 Launching MCP Server...")
74
- print("📡 Server will run in Model Context Protocol mode")
75
- print("⏹️ Press Ctrl+C to stop the server")
76
- print()
77
-
78
- try:
79
- subprocess.run([sys.executable, "mcp_server.py"])
80
- except KeyboardInterrupt:
81
- print("\n🛑 MCP Server stopped by user")
82
- except Exception as e:
83
- print(f"❌ Failed to launch MCP server: {e}")
84
-
85
- def run_demo():
86
- """Run the demonstration."""
87
- print("🎬 Running comprehensive demo...")
88
- print()
89
-
90
- try:
91
- subprocess.run([sys.executable, "demo.py"])
92
- except Exception as e:
93
- print(f"❌ Demo failed: {e}")
94
-
95
- def show_menu():
96
- """Show the main menu."""
97
- print("📋 Choose an option:")
98
- print()
99
- print("1. 🌐 Launch Gradio Web Interface (Recommended)")
100
- print("2. 🤖 Launch MCP Server")
101
- print("3. 🎬 Run Demo")
102
- print("4. 🔧 Check System Status")
103
- print("5. ❌ Exit")
104
- print()
105
-
106
- def main():
107
- """Main launcher function."""
108
- print_banner()
109
-
110
- # Check dependencies first
111
- if not check_dependencies():
112
- return
113
-
114
- # Test data loading
115
- if not test_data_loading():
116
- return
117
-
118
- print("🎯 System ready!")
119
- print()
120
-
121
- while True:
122
- show_menu()
123
-
124
- try:
125
- choice = input("Enter your choice (1-5): ").strip()
126
-
127
- if choice == "1":
128
- print()
129
- launch_gradio()
130
- print()
131
-
132
- elif choice == "2":
133
- print()
134
- launch_mcp_server()
135
- print()
136
-
137
- elif choice == "3":
138
- print()
139
- run_demo()
140
- print()
141
- input("Press Enter to continue...")
142
- print()
143
-
144
- elif choice == "4":
145
- print()
146
- print("🔍 System Status Check:")
147
- check_dependencies()
148
- test_data_loading()
149
- print()
150
- input("Press Enter to continue...")
151
- print()
152
-
153
- elif choice == "5":
154
- print()
155
- print("👋 Goodbye! Thank you for using the Agricultural Analysis Tool")
156
- break
157
-
158
- else:
159
- print("❌ Invalid choice. Please enter a number between 1-5.")
160
- print()
161
-
162
- except KeyboardInterrupt:
163
- print("\n\n👋 Goodbye! Thank you for using the Agricultural Analysis Tool")
164
- break
165
- except Exception as e:
166
- print(f"❌ Error: {e}")
167
- print()
168
-
169
- if __name__ == "__main__":
170
- main()
mcp.code-workspace DELETED
@@ -1,11 +0,0 @@
1
- {
2
- "folders": [
3
- {
4
- "path": "."
5
- },
6
- {
7
- "path": "../../../Downloads/OneDrive_1_9-17-2025"
8
- }
9
- ],
10
- "settings": {}
11
- }
mcp_server.py CHANGED
@@ -1,433 +1,296 @@
1
- """
2
- MCP Server for Agricultural Data Analysis
3
- Provides tools and resources for analyzing agricultural intervention data.
4
- """
5
 
6
- import json
7
- import logging
8
- from typing import Any, Dict, List, Optional
9
- from mcp.server import Server
10
- from mcp.server.models import InitializationOptions
11
- from mcp.server.stdio import stdio_server
12
- from mcp.types import Resource, Tool, TextContent
13
- import asyncio
14
  import pandas as pd
 
 
15
  from data_loader import AgriculturalDataLoader
16
- from analysis_tools import AgriculturalAnalyzer
17
- import plotly.io as pio
18
 
19
 
20
- # Set up logging
21
- logging.basicConfig(level=logging.INFO)
22
- logger = logging.getLogger("agricultural-mcp-server")
23
 
24
- # Initialize data components
25
- data_loader = AgriculturalDataLoader()
26
- analyzer = AgriculturalAnalyzer(data_loader)
27
 
28
- # Create MCP server
29
- server = Server("agricultural-analysis")
 
 
30
 
31
 
32
- @server.list_resources()
33
- async def list_resources() -> List[Resource]:
34
- """List available resources."""
35
- return [
36
- Resource(
37
- uri="agricultural://data/summary",
38
- name="Data Summary",
39
- mimeType="application/json",
40
- description="Summary of available agricultural intervention data"
41
- ),
42
- Resource(
43
- uri="agricultural://data/years",
44
- name="Available Years",
45
- mimeType="application/json",
46
- description="List of years with available data"
47
- ),
48
- Resource(
49
- uri="agricultural://data/plots",
50
- name="Available Plots",
51
- mimeType="application/json",
52
- description="List of available plots/parcels"
53
- ),
54
- Resource(
55
- uri="agricultural://data/crops",
56
- name="Available Crops",
57
- mimeType="application/json",
58
- description="List of available crop types"
59
- ),
60
- Resource(
61
- uri="agricultural://analysis/weed-pressure",
62
- name="Weed Pressure Analysis",
63
- mimeType="application/json",
64
- description="Current weed pressure trends analysis"
65
- ),
66
- Resource(
67
- uri="agricultural://analysis/rotation-impact",
68
- name="Crop Rotation Impact",
69
- mimeType="application/json",
70
- description="Analysis of crop rotation impact on weed pressure"
71
- )
72
- ]
73
 
74
 
75
- @server.read_resource()
76
- async def read_resource(uri: str) -> str:
77
- """Read a specific resource."""
78
  try:
79
- if uri == "agricultural://data/summary":
80
- df = data_loader.load_all_files()
81
- summary = {
82
- "total_records": len(df),
83
- "date_range": {
84
- "start": df['datedebut'].min().strftime('%Y-%m-%d') if df['datedebut'].min() else None,
85
- "end": df['datedebut'].max().strftime('%Y-%m-%d') if df['datedebut'].max() else None
86
- },
87
- "unique_plots": df['plot_name'].nunique(),
88
- "unique_crops": df['crop_type'].nunique(),
89
- "herbicide_applications": len(df[df['is_herbicide'] == True]),
90
- "years_covered": sorted(df['year'].unique().tolist())
91
- }
92
- return json.dumps(summary, indent=2)
93
-
94
- elif uri == "agricultural://data/years":
95
- years = data_loader.get_years_available()
96
- return json.dumps({"available_years": years})
97
 
98
- elif uri == "agricultural://data/plots":
99
- plots = data_loader.get_plots_available()
100
- return json.dumps({"available_plots": plots})
101
 
102
- elif uri == "agricultural://data/crops":
103
- crops = data_loader.get_crops_available()
104
- return json.dumps({"available_crops": crops})
105
-
106
- elif uri == "agricultural://analysis/weed-pressure":
107
- trends = analyzer.analyze_weed_pressure_trends()
108
- # Convert DataFrames to dict for JSON serialization
109
- serializable_trends = {}
110
- for key, value in trends.items():
111
- if isinstance(value, pd.DataFrame):
112
- serializable_trends[key] = value.to_dict('records')
113
- else:
114
- serializable_trends[key] = value
115
- return json.dumps(serializable_trends, indent=2)
116
 
117
- elif uri == "agricultural://analysis/rotation-impact":
118
- rotation_impact = analyzer.analyze_crop_rotation_impact()
119
- return json.dumps(rotation_impact.to_dict('records'), indent=2)
 
 
 
120
 
 
121
  else:
122
- raise ValueError(f"Unknown resource: {uri}")
123
-
124
  except Exception as e:
125
- logger.error(f"Error reading resource {uri}: {e}")
126
- return json.dumps({"error": str(e)})
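Review note on `read_resource`: the resource branches call `json.dumps` without `default=str`, while the tool handlers later in the file pass it. If a payload contains dates or other non-JSON types, the call raises `TypeError` and the handler falls through to the error branch. A small stdlib-only sketch of the safer pattern follows. The `to_json` helper and the sample payload are hypothetical.

```python
import json
from datetime import date
from typing import Any


def to_json(payload: Any) -> str:
    """Serialize a payload, falling back to str() for values json cannot encode."""
    return json.dumps(payload, indent=2, default=str)


if __name__ == "__main__":
    # Hypothetical summary shaped like the one built in the removed handler.
    summary = {
        "total_records": 3,
        "date_range": {"start": date(2014, 1, 1), "end": None},
    }
    print(to_json(summary))
```

Values that JSON already supports, including `None`, pass through unchanged; only unsupported objects are converted with `str()`.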
 
 
 
 
127
 
128
 
129
- @server.list_tools()
130
- async def list_tools() -> List[Tool]:
131
- """List available tools."""
132
- return [
133
- Tool(
134
- name="filter_data",
135
- description="Filter agricultural data by years, plots, crops, or intervention types",
136
- inputSchema={
137
- "type": "object",
138
- "properties": {
139
- "years": {
140
- "type": "array",
141
- "items": {"type": "integer"},
142
- "description": "List of years to filter (e.g., [2022, 2023, 2024])"
143
- },
144
- "plots": {
145
- "type": "array",
146
- "items": {"type": "string"},
147
- "description": "List of plot names to filter"
148
- },
149
- "crops": {
150
- "type": "array",
151
- "items": {"type": "string"},
152
- "description": "List of crop types to filter"
153
- },
154
- "intervention_types": {
155
- "type": "array",
156
- "items": {"type": "string"},
157
- "description": "List of intervention types to filter"
158
- }
159
- }
160
- }
161
- ),
162
- Tool(
163
- name="analyze_weed_pressure",
164
- description="Analyze weed pressure trends based on herbicide usage (IFT)",
165
- inputSchema={
166
- "type": "object",
167
- "properties": {
168
- "years": {
169
- "type": "array",
170
- "items": {"type": "integer"},
171
- "description": "Years to analyze"
172
- },
173
- "plots": {
174
- "type": "array",
175
- "items": {"type": "string"},
176
- "description": "Plots to analyze"
177
- },
178
- "include_visualization": {
179
- "type": "boolean",
180
- "description": "Whether to include visualization data",
181
- "default": True
182
- }
183
- }
184
- }
185
- ),
186
- Tool(
187
- name="predict_weed_pressure",
188
- description="Predict weed pressure for the next 3 years using machine learning",
189
- inputSchema={
190
- "type": "object",
191
- "properties": {
192
- "target_years": {
193
- "type": "array",
194
- "items": {"type": "integer"},
195
- "description": "Years to predict (default: [2025, 2026, 2027])",
196
- "default": [2025, 2026, 2027]
197
- },
198
- "plots": {
199
- "type": "array",
200
- "items": {"type": "string"},
201
- "description": "Specific plots to predict for (optional)"
202
- }
203
- }
204
- }
205
- ),
206
- Tool(
207
- name="identify_suitable_plots",
208
- description="Identify plots suitable for sensitive crops (peas, beans) based on low weed pressure",
209
- inputSchema={
210
- "type": "object",
211
- "properties": {
212
- "target_years": {
213
- "type": "array",
214
- "items": {"type": "integer"},
215
- "description": "Years to evaluate (default: [2025, 2026, 2027])",
216
- "default": [2025, 2026, 2027]
217
- },
218
- "max_ift_threshold": {
219
- "type": "number",
220
- "description": "Maximum IFT threshold for suitable plots (default: 1.0)",
221
- "default": 1.0
222
- }
223
- }
224
- }
225
- ),
226
- Tool(
227
- name="analyze_crop_rotation",
228
- description="Analyze the impact of crop rotation patterns on weed pressure",
229
- inputSchema={
230
- "type": "object",
231
- "properties": {}
232
- }
233
- ),
234
- Tool(
235
- name="analyze_herbicide_alternatives",
236
- description="Analyze herbicide usage patterns and identify most used products",
237
- inputSchema={
238
- "type": "object",
239
- "properties": {}
240
- }
241
- ),
242
- Tool(
243
- name="get_data_statistics",
244
- description="Get comprehensive statistics about the agricultural data",
245
- inputSchema={
246
- "type": "object",
247
- "properties": {
248
- "years": {
249
- "type": "array",
250
- "items": {"type": "integer"},
251
- "description": "Years to analyze (optional)"
252
- },
253
- "plots": {
254
- "type": "array",
255
- "items": {"type": "string"},
256
- "description": "Plots to analyze (optional)"
257
- }
258
- }
259
- }
260
- )
261
- ]
262
 
263
 
264
- @server.call_tool()
265
- async def call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]:
266
- """Execute a tool call."""
267
  try:
268
- if name == "filter_data":
269
- df = data_loader.filter_data(
270
- years=arguments.get("years"),
271
- plots=arguments.get("plots"),
272
- crops=arguments.get("crops"),
273
- intervention_types=arguments.get("intervention_types")
274
- )
275
-
276
- result = {
277
- "filtered_records": len(df),
278
- "summary": {
279
- "unique_plots": df['plot_name'].nunique(),
280
- "unique_crops": df['crop_type'].nunique(),
281
- "year_range": [int(df['year'].min()), int(df['year'].max())] if len(df) > 0 else [],
282
- "herbicide_applications": len(df[df['is_herbicide'] == True])
283
- },
284
- "sample_data": df.head(10).to_dict('records') if len(df) > 0 else []
285
- }
286
-
287
- return [TextContent(
288
- type="text",
289
- text=json.dumps(result, indent=2, default=str)
290
- )]
291
-
292
- elif name == "analyze_weed_pressure":
293
- trends = analyzer.analyze_weed_pressure_trends(
294
- years=arguments.get("years"),
295
- plots=arguments.get("plots")
296
- )
297
-
298
- # Convert DataFrames to dict for JSON serialization
299
- serializable_trends = {}
300
- for key, value in trends.items():
301
- if isinstance(value, pd.DataFrame):
302
- serializable_trends[key] = value.to_dict('records')
303
- else:
304
- serializable_trends[key] = value
305
-
306
- # Include visualization if requested
307
- if arguments.get("include_visualization", True):
308
- try:
309
- fig = analyzer.create_weed_pressure_visualization(
310
- years=arguments.get("years"),
311
- plots=arguments.get("plots")
312
- )
313
- # Convert plot to HTML
314
- serializable_trends["visualization_html"] = pio.to_html(fig, include_plotlyjs=True)
315
- except Exception as e:
316
- serializable_trends["visualization_error"] = str(e)
317
-
318
- return [TextContent(
319
- type="text",
320
- text=json.dumps(serializable_trends, indent=2, default=str)
321
- )]
322
-
323
- elif name == "predict_weed_pressure":
324
- predictions = analyzer.predict_weed_pressure(
325
- target_years=arguments.get("target_years", [2025, 2026, 2027]),
326
- plots=arguments.get("plots")
327
- )
328
-
329
- # Convert DataFrames to dict for JSON serialization
330
- serializable_predictions = {}
331
- for key, value in predictions.items():
332
- if key == "predictions":
333
- serializable_predictions[key] = {}
334
- for year, df in value.items():
335
- serializable_predictions[key][year] = df.to_dict('records')
336
- elif isinstance(value, pd.DataFrame):
337
- serializable_predictions[key] = value.to_dict('records')
338
- else:
339
- serializable_predictions[key] = value
340
-
341
- return [TextContent(
342
- type="text",
343
- text=json.dumps(serializable_predictions, indent=2, default=str)
344
- )]
345
-
346
- elif name == "identify_suitable_plots":
347
- suitable_plots = analyzer.identify_suitable_plots_for_sensitive_crops(
348
- target_years=arguments.get("target_years", [2025, 2026, 2027]),
349
- max_ift_threshold=arguments.get("max_ift_threshold", 1.0)
350
- )
351
-
352
- return [TextContent(
353
- type="text",
354
- text=json.dumps(suitable_plots, indent=2)
355
- )]
356
-
357
- elif name == "analyze_crop_rotation":
358
- rotation_impact = analyzer.analyze_crop_rotation_impact()
359
-
360
- return [TextContent(
361
- type="text",
362
- text=json.dumps(rotation_impact.to_dict('records'), indent=2, default=str)
363
- )]
364
-
365
- elif name == "analyze_herbicide_alternatives":
366
- herbicide_analysis = analyzer.analyze_herbicide_alternatives()
367
-
368
- return [TextContent(
369
- type="text",
370
- text=json.dumps(herbicide_analysis.to_dict('records'), indent=2, default=str)
371
- )]
372
-
373
- elif name == "get_data_statistics":
374
- df = data_loader.filter_data(
375
- years=arguments.get("years"),
376
- plots=arguments.get("plots")
377
- )
378
-
379
- stats = {
380
- "general": {
381
- "total_records": len(df),
382
- "unique_plots": df['plot_name'].nunique(),
383
- "unique_crops": df['crop_type'].nunique(),
384
- "date_range": {
385
- "start": df['datedebut'].min().strftime('%Y-%m-%d') if not df['datedebut'].isna().all() else None,
386
- "end": df['datedebut'].max().strftime('%Y-%m-%d') if not df['datedebut'].isna().all() else None
387
- }
388
- },
389
- "interventions": {
390
- "total_herbicide": len(df[df['is_herbicide'] == True]),
391
- "total_fungicide": len(df[df['is_fungicide'] == True]),
392
- "total_insecticide": len(df[df['is_insecticide'] == True])
393
- },
394
- "top_crops": df['crop_type'].value_counts().head(10).to_dict(),
395
- "top_plots": df['plot_name'].value_counts().head(10).to_dict(),
396
- "yearly_distribution": df['year'].value_counts().sort_index().to_dict()
397
- }
398
 
399
- return [TextContent(
400
- type="text",
401
- text=json.dumps(stats, indent=2, default=str)
402
- )]
403
 
404
- else:
405
- raise ValueError(f"Unknown tool: {name}")
406
 
407
- except Exception as e:
408
- logger.error(f"Error executing tool {name}: {e}")
409
- return [TextContent(
410
- type="text",
411
- text=json.dumps({"error": str(e)}, indent=2)
412
- )]
413
-
414
-
415
- async def main():
416
- """Main function to run the MCP server."""
417
- logger.info("Starting Agricultural MCP Server...")
418
 
419
- # Initialize the server
420
- async with stdio_server() as (read_stream, write_stream):
421
- await server.run(
422
- read_stream,
423
- write_stream,
424
- InitializationOptions(
425
- server_name="agricultural-analysis",
426
- server_version="1.0.0",
427
- capabilities=server.get_capabilities()
428
- )
429
- )
430
-
431
 
432
  if __name__ == "__main__":
433
- asyncio.run(main())
1
+ """MCP Server for Agricultural Weed Pressure Analysis"""
 
 
 
2
 
3
+ import gradio as gr
4
  import pandas as pd
5
+ import numpy as np
6
+ import plotly.express as px
7
  from data_loader import AgriculturalDataLoader
8
+ import warnings
9
+ warnings.filterwarnings('ignore')
10
 
11
+ class WeedPressureAnalyzer:
12
+ """Analyze weed pressure and recommend plots for sensitive crops."""
13
+
14
+ def __init__(self):
15
+ self.data_loader = AgriculturalDataLoader()
16
+ self.data_cache = None
17
+
18
+ def load_data(self):
19
+ if self.data_cache is None:
20
+ self.data_cache = self.data_loader.load_all_files()
21
+ return self.data_cache
22
+
23
+ def calculate_herbicide_ift(self, years=None):
24
+ """Calculate IFT for herbicides by plot and year."""
25
+ df = self.load_data()
26
+
27
+ if years:
28
+ df = df[df['year'].isin(years)]
29
+
30
+ herbicide_df = df[df['is_herbicide'] == True].copy()
31
+
32
+ if len(herbicide_df) == 0:
33
+ return pd.DataFrame()
34
+
35
+ ift_summary = herbicide_df.groupby(['plot_name', 'year', 'crop_type']).agg({
36
+ 'produit': 'count',
37
+ 'plot_surface': 'first',
38
+ 'quantitetot': 'sum'
39
+ }).reset_index()
40
+
41
+ ift_summary['ift_herbicide'] = ift_summary['produit'] / ift_summary['plot_surface']
42
+
43
+ return ift_summary
44
+
45
+ def predict_weed_pressure(self, target_years=(2025, 2026, 2027)):
46
+ """Predict weed pressure for future years."""
47
+ ift_data = self.calculate_herbicide_ift()
48
+
49
+ if len(ift_data) == 0:
50
+ return pd.DataFrame()
51
+
52
+ predictions = []
53
+
54
+ for plot in ift_data['plot_name'].unique():
55
+ plot_data = ift_data[ift_data['plot_name'] == plot].sort_values('year')
56
+
57
+ if len(plot_data) < 2:
58
+ continue
59
+
60
+ years = plot_data['year'].values
61
+ ift_values = plot_data['ift_herbicide'].values
62
+
63
+ if len(years) > 1:
64
+ # Fit the linear trend once; for degree 1, polyfit returns [slope, intercept]
65
+ slope, intercept = np.polyfit(years, ift_values, 1)
66
+
67
+ for target_year in target_years:
68
+ predicted_ift = slope * target_year + intercept
69
+ predicted_ift = max(0, predicted_ift)
70
+
71
+ if predicted_ift < 1.0:
72
+ risk_level = "Faible"
73
+ elif predicted_ift < 2.0:
74
+ risk_level = "Modéré"
75
+ else:
76
+ risk_level = "Élevé"
77
+
78
+ predictions.append({
79
+ 'plot_name': plot,
80
+ 'year': target_year,
81
+ 'predicted_ift': predicted_ift,
82
+ 'risk_level': risk_level,
83
+ 'recent_crops': ', '.join(plot_data['crop_type'].tail(3).unique()),
84
+ 'historical_avg_ift': plot_data['ift_herbicide'].mean()
85
+ })
86
+
87
+ return pd.DataFrame(predictions)
88
 
89
+ # Initialize analyzer
90
+ analyzer = WeedPressureAnalyzer()
 
91
 
92
+ def analyze_herbicide_trends(years_range, plot_filter):
93
+ """Analyze herbicide usage trends over time."""
94
+ try:
95
+ if len(years_range) == 2:
96
+ years = list(range(int(years_range[0]), int(years_range[1]) + 1))
97
+ else:
98
+ years = [int(y) for y in years_range]
99
+
100
+ ift_data = analyzer.calculate_herbicide_ift(years=years)
101
+
102
+ if len(ift_data) == 0:
103
+ return None, "Aucune donnée d'herbicides trouvée."
104
+
105
+ if plot_filter != "Toutes":
106
+ ift_data = ift_data[ift_data['plot_name'] == plot_filter]
107
+
108
+ fig = px.line(ift_data,
109
+ x='year',
110
+ y='ift_herbicide',
111
+ color='plot_name',
112
+ title="Évolution de l'IFT Herbicides",
113
+ labels={'ift_herbicide': 'IFT Herbicides', 'year': 'Année'})
114
+
115
+ summary = f"""
116
+ 📊 **Analyse de l'IFT Herbicides**
117
 
118
+ **Statistiques:**
119
+ - IFT moyen: {ift_data['ift_herbicide'].mean():.2f}
120
+ - IFT maximum: {ift_data['ift_herbicide'].max():.2f}
121
+ - Nombre de parcelles: {ift_data['plot_name'].nunique()}
122
 
123
+ **Interprétation:**
124
+ - IFT < 1.0: Pression faible ✅
125
+ - IFT 1.0-2.0: Pression modérée ⚠️
126
+ - IFT > 2.0: Pression élevée ❌
127
+ """
128
+
129
+ return fig, summary
130
+
131
+ except Exception as e:
132
+ return None, f"Erreur: {str(e)}"
133
 
134
+ def predict_future_weed_pressure():
135
+ """Predict weed pressure for the next 3 years."""
136
+ try:
137
+ predictions = analyzer.predict_weed_pressure()
138
+
139
+ if len(predictions) == 0:
140
+ return None, "Impossible de générer des prédictions."
141
+
142
+ fig = px.bar(predictions,
143
+ x='plot_name',
144
+ y='predicted_ift',
145
+ color='risk_level',
146
+ facet_col='year',
147
+ title='Prédiction Pression Adventices (2025-2027)',
148
+ color_discrete_map={'Faible': 'green', 'Modéré': 'orange', 'Élevé': 'red'})
149
+
150
+ low_risk = len(predictions[predictions['risk_level'] == 'Faible'])
151
+ moderate_risk = len(predictions[predictions['risk_level'] == 'Modéré'])
152
+ high_risk = len(predictions[predictions['risk_level'] == 'Élevé'])
153
+
154
+ summary = f"""
155
+ 🔮 **Prédictions 2025-2027**
156
 
157
+ **Répartition des risques:**
158
+ - ✅ Risque faible: {low_risk} prédictions
159
+ - ⚠️ Risque modéré: {moderate_risk} prédictions
160
+ - ❌ Risque élevé: {high_risk} prédictions
161
+ """
162
+
163
+ return fig, summary
164
+
165
+ except Exception as e:
166
+ return None, f"Erreur: {str(e)}"
167
 
168
+ def recommend_sensitive_crop_plots():
169
+ """Recommend plots for sensitive crops."""
 
170
  try:
171
+ predictions = analyzer.predict_weed_pressure()
172
+
173
+ if len(predictions) == 0:
174
+ return None, "Aucune recommandation disponible."
175
+
176
+ suitable_plots = predictions[predictions['risk_level'] == "Faible"].copy()
177
+
178
+ if len(suitable_plots) > 0:
179
+ suitable_plots['recommendation_score'] = 100 - (suitable_plots['predicted_ift'] * 30)
180
+ suitable_plots = suitable_plots.sort_values('recommendation_score', ascending=False)
181
 
182
+ top_recommendations = suitable_plots.head(10)[['plot_name', 'year', 'predicted_ift', 'recommendation_score']]
 
 
183
 
184
+ summary = f"""
185
+ 🌱 **Recommandations Cultures Sensibles**
186
+
187
+ **Top parcelles recommandées:**
188
+ {top_recommendations.to_string(index=False)}
189
+
190
+ **Critères:** IFT prédit < 1.0 (faible pression adventices)
191
+ """
 
 
 
 
 
 
192
 
193
+ fig = px.scatter(suitable_plots,
194
+ x='predicted_ift',
195
+ y='recommendation_score',
196
+ color='year',
197
+ hover_data=['plot_name'],
198
+ title='Parcelles Recommandées pour Cultures Sensibles')
199
 
200
+ return fig, summary
201
  else:
202
+ return None, "Aucune parcelle à faible risque identifiée."
203
+
204
  except Exception as e:
205
+ return None, f"Erreur: {str(e)}"
206
+
207
+ def generate_technical_alternatives(herbicide_family):
208
+ """Generate technical alternatives."""
209
+ summary = f"""
210
+ 🔄 **Alternatives aux {herbicide_family}**
211
 
212
+ **🚜 Alternatives Mécaniques:**
213
+ • Faux-semis répétés avant implantation
214
+ • Binage mécanique en inter-rang
215
+ • Herse étrille en post-levée précoce
216
 
217
+ **🌾 Alternatives Culturales:**
218
+ Rotation longue avec prairie temporaire
219
+ Cultures intermédiaires piège à nitrates
220
  Densité de semis optimisée
221
 
222
+ **🧪 Alternatives Biologiques:**
223
+ • Stimulateurs de défenses naturelles
224
+ • Extraits végétaux (huiles essentielles)
225
+ • Bioherbicides à base de champignons
226
 
227
+ **📋 Plan d'Action:**
228
+ 1. Tester sur petites surfaces
229
+ 2. Former les équipes
230
+ 3. Suivre l'efficacité
231
+ 4. Documenter les résultats
232
+ """
233
+
234
+ return summary
235
+
236
+ def get_available_plots():
237
+ """Get available plots."""
238
  try:
239
+ plots = analyzer.data_loader.get_plots_available()
240
+ return ["Toutes"] + plots
241
+ except Exception:
242
+ return ["Toutes"]
243
+
244
+ # Create Gradio Interface
245
+ def create_mcp_interface():
246
+ with gr.Blocks(title="🚜 Analyse Pression Adventices", theme=gr.themes.Soft()) as demo:
247
+ gr.Markdown("""
248
+ # 🚜 Analyse Pression Adventices - CRA Bretagne
249
+
250
+ Anticiper et réduire la pression des adventices pour optimiser les cultures sensibles (pois, haricot).
251
+ """)
252
+
253
+ with gr.Tabs():
254
+ with gr.Tab("📈 Analyse Tendances"):
255
+ with gr.Row():
256
+ years_slider = gr.Slider(2014, 2024, value=[2020, 2024], step=1, label="Période")
257
+ plot_dropdown = gr.Dropdown(choices=get_available_plots(), value="Toutes", label="Parcelle")
258
+
259
+ analyze_btn = gr.Button("🔍 Analyser", variant="primary")
260
+
261
+ with gr.Row():
262
+ trends_plot = gr.Plot()
263
+ trends_summary = gr.Markdown()
264
+
265
+ analyze_btn.click(analyze_herbicide_trends, [years_slider, plot_dropdown], [trends_plot, trends_summary])
266
 
267
+ with gr.Tab("🔮 Prédictions"):
268
+ predict_btn = gr.Button("🎯 Prédire 2025-2027", variant="primary")
269
+
270
+ with gr.Row():
271
+ predictions_plot = gr.Plot()
272
+ predictions_summary = gr.Markdown()
273
+
274
+ predict_btn.click(predict_future_weed_pressure, outputs=[predictions_plot, predictions_summary])
275
 
276
+ with gr.Tab("🌱 Recommandations"):
277
+ recommend_btn = gr.Button("🎯 Recommander Parcelles", variant="primary")
278
+
279
+ with gr.Row():
280
+ recommendations_plot = gr.Plot()
281
+ recommendations_summary = gr.Markdown()
282
+
283
+ recommend_btn.click(recommend_sensitive_crop_plots, outputs=[recommendations_plot, recommendations_summary])
284
 
285
+ with gr.Tab("🔄 Alternatives"):
286
+ herbicide_type = gr.Dropdown(["Herbicides", "Fongicides"], value="Herbicides", label="Type")
287
+ alternatives_btn = gr.Button("💡 Générer Alternatives", variant="primary")
288
+ alternatives_output = gr.Markdown()
289
+
290
+ alternatives_btn.click(generate_technical_alternatives, [herbicide_type], [alternatives_output])
291
 
292
+ return demo
293
 
294
  if __name__ == "__main__":
295
+ demo = create_mcp_interface()
296
+ demo.launch(mcp_server=True, server_name="0.0.0.0", server_port=7860, share=True)
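Review note on `predict_weed_pressure` in the new `mcp_server.py`: the method fits a straight line to each plot's historical herbicide IFT and extrapolates it to the target years, then labels the result with the 1.0 and 2.0 thresholds. The sketch below reproduces that logic on its own so it can be checked in isolation. It is an illustration under assumptions, not the committed code: the helper names `classify_risk` and `extrapolate_ift` and the toy data are invented here, and it skips plots with fewer than two distinct years, which is slightly stricter than the row-count check in the diff.

```python
import numpy as np
import pandas as pd


def classify_risk(ift):
    """Map a predicted IFT to the risk labels used in mcp_server.py."""
    if ift < 1.0:
        return "Faible"
    if ift < 2.0:
        return "Modéré"
    return "Élevé"


def extrapolate_ift(history, target_years=(2025, 2026, 2027)):
    """Fit one straight line per plot and extrapolate it to target_years.

    `history` needs the columns plot_name, year and ift_herbicide.
    """
    rows = []
    for plot, group in history.groupby("plot_name"):
        if group["year"].nunique() < 2:
            continue  # a line needs at least two distinct years
        # Degree-1 polyfit returns the coefficients as [slope, intercept]
        slope, intercept = np.polyfit(group["year"], group["ift_herbicide"], 1)
        for year in target_years:
            predicted = max(0.0, slope * year + intercept)  # clamp negative IFT to 0
            rows.append({
                "plot_name": plot,
                "year": year,
                "predicted_ift": predicted,
                "risk_level": classify_risk(predicted),
            })
    return pd.DataFrame(rows)


# Toy data, invented for illustration (not taken from the Kerguéhennec dataset)
history = pd.DataFrame({
    "plot_name": ["A", "A", "A", "B"],
    "year": [2022, 2023, 2024, 2024],
    "ift_herbicide": [1.0, 1.5, 2.0, 0.5],
})
predictions = extrapolate_ift(history)
print(predictions)
```

Plot B has a single year of history and is skipped, which mirrors the `continue` in the diff. A linear extrapolation of this kind grows without bound, so the predicted values are best read as a trend indicator rather than a forecast.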
requirements.txt CHANGED
@@ -1,10 +1,8 @@
1
- gradio[mcp]>=4.43
2
  pandas>=2.0.0
3
  numpy>=1.24.0
4
- matplotlib>=3.6.0
5
- seaborn>=0.12.0
6
- scikit-learn>=1.3.0
7
  datasets>=2.14.0
8
- huggingface_hub>=0.17.0
9
- openpyxl>=3.1.0
10
- plotly>=5.15.0
 
1
+ gradio>=4.0.0
2
  pandas>=2.0.0
3
  numpy>=1.24.0
4
+ plotly>=5.0.0
 
 
5
  datasets>=2.14.0
6
+ huggingface_hub>=0.16.0
7
+ matplotlib>=3.7.0
8
+ seaborn>=0.12.0
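Review note on `requirements.txt`: since several minimum versions changed in this commit, it may help to confirm that a given environment satisfies them before launching the app. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the version comparison is deliberately simplified (leading numeric parts only); for full PEP 440 handling, `packaging.version.Version` would be the more robust choice.

```python
from importlib import metadata
from itertools import takewhile

# Minimums copied from the new requirements.txt in this commit
MINIMUMS = {
    "gradio": "4.0.0",
    "pandas": "2.0.0",
    "numpy": "1.24.0",
    "plotly": "5.0.0",
    "datasets": "2.14.0",
    "huggingface_hub": "0.16.0",
    "matplotlib": "3.7.0",
    "seaborn": "0.12.0",
}


def parse_version(text):
    """Turn '2.14.0' into (2, 14, 0), keeping only each part's leading digits."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at the first non-numeric part, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after zero-padding them to equal length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums):
    """Return {package: status} where status is 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_environment(MINIMUMS).items():
        print(f"{name:<16} {status}")
```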
test_data_sources.py DELETED
@@ -1,190 +0,0 @@
1
- #!/usr/bin/env python3
2
- """
3
- Test script to demonstrate loading data from both local files and Hugging Face.
4
- """
5
-
6
- import warnings
7
- warnings.filterwarnings('ignore')
8
-
9
- from data_loader import AgriculturalDataLoader
10
- import os
11
-
12
- def test_local_loading():
13
- """Test loading from local files."""
14
- print("🔍 TESTING LOCAL FILE LOADING")
15
- print("=" * 50)
16
-
17
- try:
18
- # Create loader for local files
19
- loader = AgriculturalDataLoader.create_local_loader(
20
- data_path="/Users/tracyandre/Downloads/OneDrive_1_9-17-2025"
21
- )
22
-
23
- # Load data
24
- df = loader.load_all_files()
25
-
26
- print(f"✅ Local loading successful!")
27
- print(f"📊 Records: {len(df):,}")
28
- print(f"📅 Years: {sorted(df['year'].unique())}")
29
- print(f"🌱 Crops: {df['crop_type'].nunique()}")
30
- print(f"📍 Plots: {df['plot_name'].nunique()}")
31
-
32
- return True
33
-
34
- except Exception as e:
35
- print(f"❌ Local loading failed: {e}")
36
- return False
37
-
38
- def test_hf_loading():
39
- """Test loading from Hugging Face."""
40
- print("\n🤗 TESTING HUGGING FACE LOADING")
41
- print("=" * 50)
42
-
43
- # Check if HF token is available
44
- hf_token = os.environ.get("HF_TOKEN")
45
- if not hf_token:
46
- print("⚠️ No HF_TOKEN found in environment variables")
47
- print("💡 Set HF_TOKEN to test Hugging Face loading")
48
- return False
49
-
50
- try:
51
- # Create loader for Hugging Face
52
- loader = AgriculturalDataLoader.create_hf_loader(
53
- dataset_id="HackathonCRA/2024",
54
- hf_token=hf_token
55
- )
56
-
57
- # Load data
58
- df = loader.load_from_huggingface()
59
-
60
- print(f"✅ Hugging Face loading successful!")
61
- print(f"📊 Records: {len(df):,}")
62
- print(f"📅 Years: {sorted(df['year'].unique())}")
63
- print(f"🌱 Crops: {df['crop_type'].nunique()}")
64
- print(f"📍 Plots: {df['plot_name'].nunique()}")
65
-
66
- return True
67
-
68
- except Exception as e:
69
- print(f"❌ Hugging Face loading failed: {e}")
70
- print("💡 Make sure the dataset exists and you have access")
71
- return False
72
-
73
- def test_auto_fallback():
74
- """Test automatic fallback from HF to local files."""
75
- print("\n🔄 TESTING AUTO FALLBACK (HF -> LOCAL)")
76
- print("=" * 50)
77
-
78
- try:
79
- # Create loader with HF enabled but potentially failing
80
- loader = AgriculturalDataLoader(
81
- data_path="/Users/tracyandre/Downloads/OneDrive_1_9-17-2025",
82
- dataset_id="nonexistent-dataset", # This should fail
83
- use_hf=True
84
- )
85
-
86
- # This should try HF first, then fallback to local
87
- df = loader.load_all_files()
88
-
89
- print(f"✅ Auto fallback successful!")
90
- print(f"📊 Records: {len(df):,}")
91
- print("🔄 Successfully fell back to local files after HF failure")
92
-
93
- return True
94
-
95
- except Exception as e:
96
- print(f"❌ Auto fallback failed: {e}")
97
- return False
98
-
99
- def test_data_source_switching():
100
- """Test switching between data sources."""
101
- print("\n🔀 TESTING DATA SOURCE SWITCHING")
102
- print("=" * 50)
103
-
104
- try:
105
- # Create loader
106
- loader = AgriculturalDataLoader(
107
- data_path="/Users/tracyandre/Downloads/OneDrive_1_9-17-2025",
108
- dataset_id="HackathonCRA/2024"
109
- )
110
-
111
- # Test local loading
112
- loader.set_data_source(use_hf=False)
113
- df_local = loader.load_all_files()
114
- print(f"📁 Local: {len(df_local):,} records")
115
-
116
- # Test switching to HF (if token available)
117
- if os.environ.get("HF_TOKEN"):
118
- loader.set_data_source(use_hf=True)
119
- try:
120
- df_hf = loader.load_all_files()
121
- print(f"🤗 HF: {len(df_hf):,} records")
122
-
123
- # Compare data
124
- if len(df_local) == len(df_hf):
125
- print("✅ Data consistency: Same number of records")
126
- else:
127
- print(f"⚠️ Data difference: Local={len(df_local)}, HF={len(df_hf)}")
128
-
129
- except Exception as e:
130
- print(f"🤗 HF loading failed (expected): {e}")
131
- else:
132
- print("⚠️ No HF_TOKEN - skipping HF test")
133
-
134
- return True
135
-
136
- except Exception as e:
137
- print(f"❌ Data source switching failed: {e}")
138
- return False
139
-
140
- def main():
141
- """Run all tests."""
142
- print("🚜 AGRICULTURAL DATA LOADER TESTING")
143
- print("=" * 60)
144
- print()
145
-
146
- results = []
147
-
148
- # Test 1: Local loading
149
- results.append(("Local Loading", test_local_loading()))
150
-
151
- # Test 2: Hugging Face loading
152
- results.append(("HF Loading", test_hf_loading()))
153
-
154
- # Test 3: Auto fallback
155
- results.append(("Auto Fallback", test_auto_fallback()))
156
-
157
- # Test 4: Data source switching
158
- results.append(("Source Switching", test_data_source_switching()))
159
-
160
- # Summary
161
- print("\n📋 TEST SUMMARY")
162
- print("=" * 30)
163
-
164
- passed = 0
165
- for test_name, result in results:
166
- status = "✅ PASS" if result else "❌ FAIL"
167
- print(f"{test_name:<20} {status}")
168
- if result:
169
- passed += 1
170
-
171
- print(f"\n🎯 Results: {passed}/{len(results)} tests passed")
172
-
173
- if passed == len(results):
174
- print("🎉 All tests passed! Data loader is working correctly.")
175
- else:
176
- print("⚠️ Some tests failed. Check the output above for details.")
177
-
178
- print("\n💡 Usage Examples:")
179
- print("# Load from local files:")
180
- print("loader = AgriculturalDataLoader.create_local_loader('/path/to/data')")
181
- print()
182
- print("# Load from Hugging Face:")
183
- print("loader = AgriculturalDataLoader.create_hf_loader('HackathonCRA/2024')")
184
- print()
185
- print("# Auto-detect with fallback:")
186
- print("loader = AgriculturalDataLoader(use_hf=True)")
187
- print("df = loader.load_all_files() # Tries HF first, falls back to local")
188
-
189
- if __name__ == "__main__":
190
- main()
test_hf_only.py DELETED
@@ -1,155 +0,0 @@
1
- #!/usr/bin/env python3
2
- """
3
- Test script to validate Hugging Face only loading.
4
- """
5
-
6
- import os
7
- import warnings
8
- warnings.filterwarnings('ignore')
9
-
10
- def test_hf_only_loading():
11
- """Test that the loader only works with Hugging Face."""
12
- print("🤗 TESTING HUGGING FACE ONLY LOADING")
13
- print("=" * 50)
14
-
15
- from data_loader import AgriculturalDataLoader
16
-
17
- # Check if HF token is available
18
- hf_token = os.environ.get("HF_TOKEN")
19
- if not hf_token:
20
- print("⚠️ No HF_TOKEN found in environment variables")
21
- print("💡 Set HF_TOKEN to test Hugging Face loading")
22
- print("🔧 For this test, we'll try without token (may fail)")
23
-
24
- try:
25
- # Create loader (HF only)
26
- loader = AgriculturalDataLoader(
27
- dataset_id="HackathonCRA/2024",
28
- hf_token=hf_token
29
- )
30
-
31
- print(f"🤗 Attempting to load from dataset: {loader.dataset_id}")
32
-
33
- # Load data
34
- df = loader.load_all_files()
35
-
36
- print(f"✅ Success! Loaded {len(df):,} records from Hugging Face")
37
- print(f"📊 Years: {sorted(df['year'].unique())}")
38
- print(f"🌱 Crops: {df['crop_type'].nunique()}")
39
- print(f"📍 Plots: {df['plot_name'].nunique()}")
40
- print(f"💊 Herbicide applications: {df['is_herbicide'].sum()}")
41
-
42
- return True
43
-
44
- except Exception as e:
45
- print(f"❌ Failed to load from Hugging Face: {e}")
46
- print("💡 This is expected if the dataset doesn't exist yet")
47
- print("🔧 Make sure to upload your dataset to HF Hub first")
48
- return False
49
-
50
- def test_no_local_fallback():
51
- """Test that there's no local fallback."""
52
- print("\n🚫 TESTING NO LOCAL FALLBACK")
53
- print("=" * 50)
54
-
55
- from data_loader import AgriculturalDataLoader
56
-
57
- try:
58
- # Create loader with non-existent dataset
59
- loader = AgriculturalDataLoader(
60
- dataset_id="nonexistent/dataset"
61
- )
62
-
63
- # This should fail without falling back to local
64
- df = loader.load_all_files()
65
-
66
- print(f"❌ Unexpected success - loaded {len(df)} records")
67
- print("⚠️ This suggests local fallback is still active")
68
- return False
69
-
70
- except Exception as e:
71
- print(f"✅ Expected failure: {e}")
72
- print("✅ Confirmed: No local fallback, HF only")
73
- return True
74
-
75
- def test_simple_usage():
76
- """Test simple usage pattern."""
77
- print("\n📝 SIMPLE USAGE EXAMPLE")
78
- print("=" * 50)
79
-
80
- print("💡 Recommended usage pattern:")
81
- print()
82
-
83
- usage_code = '''
84
- from data_loader import AgriculturalDataLoader
85
-
86
- # Simple HF-only loader
87
- loader = AgriculturalDataLoader(dataset_id="HackathonCRA/2024")
88
-
89
- # Load data (will use HF_TOKEN from environment)
90
- df = loader.load_all_files()
91
-
92
- # Analyze data
93
- print(f"Loaded {len(df)} records from Hugging Face")
94
- '''
95
-
96
- print(usage_code)
97
-
98
- try:
99
- from data_loader import AgriculturalDataLoader
100
- loader = AgriculturalDataLoader(dataset_id="HackathonCRA/2024")
101
- print("✅ Loader created successfully")
102
- print(f"🎯 Target dataset: {loader.dataset_id}")
103
- print(f"🔑 Using token: {'Yes' if loader.hf_token else 'No (from env)'}")
104
-
105
- return True
106
-
107
- except Exception as e:
108
- print(f"❌ Failed to create loader: {e}")
109
- return False
110
-
111
- def main():
112
- """Run all tests."""
113
- print("🚜 HUGGING FACE ONLY - VALIDATION TESTS")
114
- print("=" * 60)
115
- print()
116
-
117
- results = []
118
-
119
- # Test 1: HF loading
120
- results.append(("HF Only Loading", test_hf_only_loading()))
121
-
122
- # Test 2: No local fallback
123
- results.append(("No Local Fallback", test_no_local_fallback()))
124
-
125
- # Test 3: Simple usage
126
- results.append(("Simple Usage", test_simple_usage()))
127
-
128
- # Summary
129
- print("\n📋 TEST SUMMARY")
130
- print("=" * 30)
131
-
132
- passed = 0
133
- for test_name, result in results:
134
- status = "✅ PASS" if result else "❌ FAIL"
135
- print(f"{test_name:<20} {status}")
136
- if result:
137
- passed += 1
138
-
139
- print(f"\n🎯 Results: {passed}/{len(results)} tests passed")
140
-
141
- if passed >= 2: # Allow HF loading to fail if dataset doesn't exist
142
- print("🎉 Validation successful! Loader is HF-only.")
143
- else:
144
- print("⚠️ Validation issues detected.")
145
-
146
- print("\n🚀 DEPLOYMENT CHECKLIST:")
147
- print("✅ Remove local file dependencies")
148
- print("✅ HF-only data loading")
149
- print("✅ No fallback mechanisms")
150
- print("🔲 Upload dataset to HF Hub")
151
- print("🔲 Set HF_TOKEN in production")
152
- print("🔲 Test with real HF dataset")
153
-
154
- if __name__ == "__main__":
155
- main()