{"id":79532,"name":"S4A","description":"A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning.","url":"https://github.com/orion-ai-lab/s4a","last_synced_at":"2026-04-17T23:30:19.559Z","repository":{"id":40548335,"uuid":"438341878","full_name":"Orion-AI-Lab/S4A","owner":"Orion-AI-Lab","description":"Sen4AgriNet: A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning","archived":false,"fork":false,"pushed_at":"2024-11-06T10:02:04.000Z","size":7053,"stargazers_count":111,"open_issues_count":5,"forks_count":21,"subscribers_count":6,"default_branch":"main","last_synced_at":"2026-04-05T18:06:21.364Z","etag":null,"topics":["crop-classification","deep-learning","segmentation","sentinel-2"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Orion-AI-Lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-12-14T17:28:41.000Z","updated_at":"2026-03-20T11:52:59.000Z","dependencies_parsed_at":"2022-07-13T15:59:43.670Z","dependency_job_id":"7d85fcef-5ebe-4f00-8ad6-18535ef4f160","html_url":"https://github.com/Orion-AI-Lab/S4A","commit_stats":{"total_commits":17,"total_committers":3,"mean_commits":5.666666666666667,"dds":0.4117647058823529,"last_synced_commit":"2f1da433a56a361535e253bebbb0d09952caea51"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Orion-AI-Lab/S4A","repository_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Orion-AI-Lab","download_url":"https://codeload.github.com/Orion-AI-Lab/S4A/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31695165,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T20:18:30.949Z","status":"ssl_error","status_checked_at":"2026-04-11T20:18:29.982Z","response_time":54,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"owner":{"login":"Orion-AI-Lab","name":"Orion Lab","uuid":"94176283","kind":"organization","description":"Orion Lab research group: Deep Learning in Earth Observation at the National Observatory of 
Athens","email":"ipapoutsis@noa.gr","website":null,"location":"Greece","twitter":null,"company":null,"icon_url":"https://avatars.githubusercontent.com/u/94176283?v=4","repositories_count":5,"last_synced_at":"2023-03-10T08:16:05.630Z","metadata":{"has_sponsors_listing":false},"html_url":"https://github.com/Orion-AI-Lab","funding_links":[],"total_stars":null,"followers":null,"following":null,"created_at":"2022-11-20T04:57:33.476Z","updated_at":"2023-03-10T08:16:05.639Z","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Orion-AI-Lab","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Orion-AI-Lab/repositories"},"packages":[],"commits":{"id":1359406,"full_name":"Orion-AI-Lab/S4A","default_branch":"main","total_commits":18,"total_committers":2,"total_bot_commits":0,"total_bot_committers":0,"mean_commits":9.0,"dds":0.11111111111111116,"past_year_total_commits":0,"past_year_total_committers":0,"past_year_total_bot_commits":0,"past_year_total_bot_committers":0,"past_year_mean_commits":0.0,"past_year_dds":0.0,"last_synced_at":"2026-04-15T22:30:59.859Z","last_synced_commit":"7ac66dce57abb6dd05683891bc810c2ddc36a966","created_at":"2023-09-13T13:25:05.855Z","updated_at":"2026-04-15T22:30:59.841Z","committers":[{"name":"masdra","email":"paren8esis@gmail.com","login":"paren8esis","count":16},{"name":"dimzog","email":"zog.dim3@gmail.com","login":"dimzog","count":2}],"past_year_committers":[],"commits_url":"https://commits.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A/commits","host":{"name":"GitHub","url":"https://github.com","kind":"github","last_synced_at":"2026-04-15T00:00:09.512Z","repositories_count":6213589,"commits_count":900137604,"contributors_count":34924064,"owners_count":1144686,"icon_url":"https://github.com/github.png","host_url":"https://commits.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://commits.ecosyste.ms/api/v1/hosts/GitHub/repositories"}},"issues_stats":{"full_name":"Orion-AI-Lab/S4A","htm
l_url":"https://github.com/Orion-AI-Lab/S4A","last_synced_at":"2026-04-09T20:02:53.004Z","status":"error","issues_count":9,"pull_requests_count":1,"avg_time_to_close_issue":23721.75,"avg_time_to_close_pull_request":1288081.0,"issues_closed_count":4,"pull_requests_closed_count":1,"pull_request_authors_count":1,"issue_authors_count":7,"avg_comments_per_issue":1.777777777777778,"avg_comments_per_pull_request":0.0,"merged_pull_requests_count":1,"bot_issues_count":0,"bot_pull_requests_count":0,"past_year_issues_count":5,"past_year_pull_requests_count":0,"past_year_avg_time_to_close_issue":7062.666666666667,"past_year_avg_time_to_close_pull_request":null,"past_year_issues_closed_count":3,"past_year_pull_requests_closed_count":0,"past_year_pull_request_authors_count":0,"past_year_issue_authors_count":3,"past_year_avg_comments_per_issue":0.4,"past_year_avg_comments_per_pull_request":null,"past_year_bot_issues_count":0,"past_year_bot_pull_requests_count":0,"past_year_merged_pull_requests_count":0,"created_at":"2023-09-13T13:25:12.723Z","updated_at":"2026-04-09T20:02:53.005Z","repository_url":"https://issues.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A","issues_url":"https://issues.ecosyste.ms/api/v1/hosts/GitHub/repositories/Orion-AI-Lab%2FS4A/issues","issue_labels_count":{},"pull_request_labels_count":{},"issue_author_associations_count":{"NONE":9},"pull_request_author_associations_count":{"CONTRIBUTOR":1},"issue_authors":{"StorywithLove":2,"VSainteuf":2,"Spiruel":1,"PeterKKan":1,"Multihuntr":1,"nilsleh":1,"saqibzia-dev":1},"pull_request_authors":{"paren8esis":1},"host":{"name":"GitHub","url":"https://github.com","kind":"github","last_synced_at":"2026-04-16T00:00:09.014Z","repositories_count":14288487,"issues_count":34566667,"pull_requests_count":113100118,"authors_count":11236216,"icon_url":"https://github.com/github.png","host_url":"https://issues.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://issues.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories","owners_url":"https://issues.ecosyste.ms/api/v1/hosts/GitHub/owners","authors_url":"https://issues.ecosyste.ms/api/v1/hosts/GitHub/authors"},"past_year_issue_labels_count":{},"past_year_pull_request_labels_count":{},"past_year_issue_author_associations_count":{},"past_year_pull_request_author_associations_count":{},"past_year_issue_authors":{},"past_year_pull_request_authors":{},"maintainers":[],"active_maintainers":[]},"events":{"total":{"ForkEvent":3,"IssuesEvent":4,"WatchEvent":14,"IssueCommentEvent":1,"PushEvent":1},"last_year":{"ForkEvent":1,"WatchEvent":5}},"keywords":["crop-classification","deep-learning","segmentation","sentinel-2"],"dependencies":[],"score":5.44673737166631,"created_at":"2023-09-19T00:08:29.687Z","updated_at":"2026-04-17T23:30:19.561Z","avatar_url":"https://github.com/Orion-AI-Lab.png","language":"Jupyter Notebook","category":"Consumption","sub_category":"Agriculture and Nutrition","monthly_downloads":0,"total_dependent_repos":0,"total_dependent_packages":0,"readme":"## Sen4AgriNet\n\n#### A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning\n\n\n**Contributors:** [Sykas D.](https://github.com/dimsyk), [Zografakis D.](https://github.com/dimzog), [Sdraka M.](https://github.com/paren8esis)\n\n---\n\n**Supplementary repo with DL experiments using the Sen4AgriNet dataset:** [Sen4AgriNet-Models](https://github.com/Orion-AI-Lab/S4A-Models).\n\n---\n\nThis repository provides a native PyTorch Dataset Class for Sen4AgriNet dataset (`patches_dataset.py`). 
It should work with PyTorch 1.7.1+ and Python 3.8.5+.\n\nThe dataset class relies heavily on [cocoapi](https://github.com/cocodataset/cocoapi) for data loading and indexing, so make sure it is installed:\n```bash\npip3 install pycocotools\n```\n\nThen install the remaining requirements:\n```bash\npip3 install -r requirements.txt\n```\n\n### Instructions\n\nIn order to use the provided PyTorch Dataset class, the required netCDF files of Sen4AgriNet must be downloaded and placed inside the `dataset/netcdf/` folder. These files are available for download at [Dropbox](https://www.dropbox.com/scl/fo/ne0dpq72gi3ayhqj0hg60/h?dl=0\u0026rlkey=b0148zl6yja7ph26bpfms6knt), [Google Drive](https://drive.google.com/drive/folders/1-qKhlaMUPPI7Th7xTE2vIXY2nIowrSiC?usp=sharing) and [HuggingFace Hub](https://huggingface.co/datasets/orion-ai-lab/S4A).\n\nThen, three separate COCO files must be created: one for training, one for validation and one for testing. Alternatively, the predefined COCO files for the 3 Scenarios can be downloaded from [here](https://www.dropbox.com/sh/kvgo4r2vin7sbwt/AACzDLNbnSouuZYMk8Y9I4sha?dl=0).\n\nAfter this initial setup, `patches_dataset.py` can be used in a PyTorch deep learning pipeline to load, prepare and return patches from the dataset according to the split dictated by the COCO files. This Dataset class has the following features:\n - Reads the netCDF files of the dataset containing the Sentinel-2 observations over time and the corresponding labels.\n - Isolates the Sentinel-2 bands requested by the user.\n - Computes the median Sentinel-2 image on a given frequency, e.g. 
monthly (or loads precomputed medians, if any).\n - Returns the timeseries of median images inside a predefined window.\n - Normalizes the images.\n - Returns Hollstein masks for clouds, cirrus, shadow or snow.\n - Returns a parcel mask: 1 for parcel, 0 for non-parcel.\n - Can alternatively return binary labels: 1 for crops, 0 for non-crops.\n\n### Dataset exploration\n\nThis is roughly how `patches_dataset.py` works. The whole procedure is also described in the provided [notebook](https://github.com/Orion-AI-Lab/S4A/blob/main/patch_aggregation_visualization.ipynb).\n\n1. Open a netCDF file for exploration.\n\n```python3\nimport netCDF4\nfrom pathlib import Path\n\npatch = netCDF4.Dataset(Path('data/2020_31TCG_patch_14_14.nc'), 'r')\npatch\n```\n\nOutput:\n```python3\n\"\"\"\n\u003cclass 'netCDF4._netCDF4.Dataset'\u003e\nroot group (NETCDF4 data model, file format HDF5):\n    title: S4A Patch Dataset\n    authors: Papoutsis I., Sykas D., Zografakis D., Sdraka M.\n    patch_full_name: 2020_31TCG_patch_14_14\n    patch_year: 2020\n    patch_name: patch_14_14\n    patch_country_code: ES\n    patch_tile: 31TCG\n    creation_date: 27 Apr 2021\n    references: Documentation available at .\n    institution: National Observatory of Athens.\n    version: 21.03\n    _format: NETCDF4\n    _nco_version: netCDF Operators version 4.9.1 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)\n    _xarray_version: 0.17.0\n    dimensions(sizes):\n    variables(dimensions):\n    groups: B01, B02, B03, B04, B05, B06, B07, B08, B09, B10, B11, B12, B8A, labels, parcels\n\"\"\"\n```\n2. Visualize a single timestamp.\n\n```python3\nimport xarray as xr\n\nband_data = xr.open_dataset(xr.backends.NetCDF4DataStore(patch['B02']))\nband_data.B02.isel(time=0).plot()\n```\n![Single Month](images/single_timestamp.png)\n\n\n3. 
Visualize the labels:\n\n```python3\nlabels = xr.open_dataset(xr.backends.NetCDF4DataStore(patch['labels']))\nlabels.labels.plot()\n```\n![Labels](images/labels.png)\n\n4. Visualize the parcels:\n\n```python3\nparcels = xr.open_dataset(xr.backends.NetCDF4DataStore(patch['parcels']))\nparcels.parcels.plot()\n```\n![Parcels](images/parcels.png)\n\n5. Plot the median of observations for each month:\n\n```python3\nimport pandas as pd\n# Or maybe aggregate based on a given frequency\n# Refer to\n# https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases\ngroup_freq = '1MS'\n\n# Grab year from netcdf4's global attribute\nyear = patch.patch_year\n\n# output intervals\ndate_range = pd.date_range(start=f'{year}-01-01', end=f'{int(year) + 1}-01-01', freq=group_freq)\n\n# Aggregate based on given frequency\nband_data = band_data.groupby_bins(\n    'time',\n    bins=date_range,\n    right=True,\n    include_lowest=False,\n    labels=date_range[:-1]\n).median(dim='time')\n```\n\nIf you plot right now, you might notice that some months are empty:\n![Single Month](images/per_month_empty.png)\n\n(Optional) Fill in empty months:\n\n```python3\nimport matplotlib.pyplot as plt\n\nband_data = band_data.interpolate_na(dim='time_bins', method='linear', fill_value='extrapolate')\n\nfig, axes = plt.subplots(nrows=3, ncols=4, figsize=(18, 12))\n\nfor i, season in enumerate(band_data.B02):\n\n    ax = axes.flat[i]\n    cax = band_data.B02.isel(time_bins=i).plot(ax=ax)\n\n\nfor i, ax in enumerate(axes.flat):\n    ax.axes.get_xaxis().set_ticklabels([])\n    ax.axes.get_yaxis().set_ticklabels([])\n    ax.axes.axis('tight')\n    ax.set_xlabel('')\n    ax.set_ylabel('')\n    ax.set_title(f'Month: {i+1}')\n\n\nplt.tight_layout()\nplt.show()\n```\n![Per Month](images/per_month.png)\n\n### PatchesDataset usage example\n\nPlease refer to the provided [notebook](https://github.com/Orion-AI-Lab/S4A/blob/main/s4a-dataloaders.ipynb) for a detailed usage example 
of the provided `PatchesDataset`.\n\n1. Read the COCO file to be used.\n```python3\nfrom pathlib import Path\nfrom pycocotools.coco import COCO\nroot_path_coco = Path('coco_files/')\ncoco_train = COCO(root_path_coco / 'coco_example.json')\n```\n\n2. Initialize the PatchesDataset.\n```python3\nfrom torch.utils.data import DataLoader\nfrom patches_dataset import PatchesDataset\nfrom utils.config import LINEAR_ENCODER\nroot_path_netcdf = Path('dataset/netcdf')  # Path to the netCDF files\ndataset_train = PatchesDataset(root_path_netcdf=root_path_netcdf,\n                               coco=coco_train,\n                               group_freq='1MS',\n                               prefix='test_patchesdataset',\n                               bands=['B02', 'B03', 'B04'],\n                               linear_encoder=LINEAR_ENCODER,\n                               saved_medians=False,\n                               window_len=6,\n                               requires_norm=False,\n                               return_masks=False,\n                               clouds=False,\n                               cirrus=False,\n                               shadow=False,\n                               snow=False,\n                               output_size=(183, 183)\n                              )\n```\n\n3. Initialize the Dataloader.\n```python3\ndataloader_train = DataLoader(dataset_train,\n                              batch_size=1,\n                              shuffle=True,\n                              num_workers=4,\n                              pin_memory=True\n                             )\n```\n\n4. 
Get a batch.\n```python3\nbatch = next(iter(dataloader_train))\n```\n\nThe `batch` variable is a dictionary containing the keys: `medians`, `labels`, `idx`.\n`batch['medians']` contains a PyTorch tensor of size `[1, 6, 3, 183, 183]` where:\n  - batch size: 1\n  - timestamps: 6\n  - bands: 3\n  - height: 183\n  - width: 183\n\n![Batch Medians](images/batch_medians.png)\n\n`batch['labels']` contains the labels corresponding to the medians, as a PyTorch tensor of size `[1, 183, 183]` where:\n  - batch size: 1\n  - height: 183\n  - width: 183\n\n![Batch Labels](images/batch_labels.png)\n\n`batch['idx']` contains the index of the returned timeseries.\n\n### Webpage\n\nDataset Webpage: https://www.sen4agrinet.space.noa.gr/\n\n### Experiments\n\nPlease visit [Sen4AgriNet-Models](https://github.com/Orion-AI-Lab/S4A-Models) for a complete experimentation pipeline using the Sen4AgriNet dataset.\n\n### Citation\n\nTo cite, please use:\n```\n@ARTICLE{\n  9749916,\n  author={Sykas, Dimitrios and Sdraka, Maria and Zografakis, Dimitrios and Papoutsis, Ioannis},\n  journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},\n  title={A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning},\n  year={2022},\n  doi={10.1109/JSTARS.2022.3164771}\n}\n```\n","funding_links":[],"readme_doi_urls":[],"works":{},"citation_counts":{},"total_citations":0,"keywords_from_contributors":[],"project_url":"https://ost.ecosyste.ms/api/v1/projects/79532","html_url":"https://ost.ecosyste.ms/projects/79532"}