FoodSafeSum
Authors/Creators
- Bakagianni, Juli (Data curator)
- Randl, Korbinian Robert (Data curator)¹
- Rocchietti, Guido (Data curator)²
- Rulli, Cosimo (Project member)²
- Nardini, Franco Maria (Project member)²
- Henriksson, Aron³
- Trani, Salvatore (Project member)²
- Romanova, Anna (Data collector)⁴
- Garcia, Mariano (Data collector)⁴
- Pavlopoulos, John (Data curator)⁵
Description
FoodSafeSum is a machine-actionable dataset for NLP in food safety. It contains human-written and LLM-generated summaries and titles of 2,091 food-safety documents, together with manually curated topics and document types and automatically extracted hazard annotations. The documents were gathered by SGS Digicomply from news, regulatory/legal sources, guidance portals, and scientific outlets. They span the years 2002–2023; about 58% were originally in English, and the rest were translated and curated. The dataset supports research on classification, retrieval, RAG-style QA, and event clustering in food-safety monitoring and policy. (See Section 3 and Table 7 of the paper for the schema and fields, and Figures 1–3 for source and document-type statistics.)
What’s included?
For each source item:
- Manual summary and manual title (by domain experts)
- LLM summary and LLM title (generated with meta.llama3-70b-instruct via Bedrock)
- Document type (News, Regulation, Guidance, Scientific) and topic labels (12 high-level categories, e.g., Policies & Laws; Contaminants, residues & contact materials)
- Hazard annotations auto-extracted from a controlled vocabulary derived from prior work
- Source name and original title
Note: The full original documents are not included in the public release (used internally for analysis only).
Format & schema
- Primary release as CSV/JSON with columns (see Table 7): manual_summary, manual_title, llama70b_summary, llama70b_title, source_name, doc_type, topics, plus hazards (list) and any auxiliary metadata used for experiments.
- Multilingual inputs were translated (Google Translate/DeepL) and curated; see paper for details.
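As a minimal sketch of how the columns listed above might be read, the following uses only the Python standard library. The column names come from this record. How the list-valued hazards column is serialized in the CSV is not stated here, so the parser below assumes either a Python-style list literal or a semicolon-delimited string. The helper names (`parse_list_cell`, `load_records`) and the sample row are hypothetical placeholders, not actual dataset content.

```python
import ast
import csv
import io

# Column names as listed in this record (see Table 7 of the paper).
EXPECTED_COLUMNS = [
    "manual_summary", "manual_title", "llama70b_summary", "llama70b_title",
    "source_name", "doc_type", "topics", "hazards",
]


def parse_list_cell(cell):
    """Parse a list-valued cell into a list of strings.

    ASSUMPTION: the cell holds a Python-style list literal or a
    semicolon-delimited string; the real serialization may differ.
    """
    cell = (cell or "").strip()
    if not cell:
        return []
    try:
        value = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        # Not a literal: fall back to splitting on semicolons.
        return [part.strip() for part in cell.split(";") if part.strip()]
    if isinstance(value, (list, tuple, set)):
        return [str(item) for item in value]
    return [str(value)]


def load_records(handle):
    """Read rows from an open CSV handle and check the expected columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    records = []
    for row in reader:
        row["hazards"] = parse_list_cell(row["hazards"])
        records.append(row)
    return records


# Placeholder row for illustration only; not taken from the dataset.
SAMPLE = """manual_summary,manual_title,llama70b_summary,llama70b_title,source_name,doc_type,topics,hazards
placeholder summary,placeholder title,placeholder summary,placeholder title,example source,News,Policies & Laws,"['hazard a', 'hazard b']"
"""

records = load_records(io.StringIO(SAMPLE))
print(records[0]["doc_type"], records[0]["hazards"])
```

To read the released file instead of the placeholder, pass an open handle, for example `open("fss.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is likewise an assumption.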
Files
| Name | Checksum | Size |
|---|---|---|
| fss.csv | md5:6e6061b643e2628c674adb9e4845f364 | 2.9 MB |
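The MD5 checksum published for fss.csv can be used to confirm that a download is intact. The sketch below shows one way to do this with Python's `hashlib`; the function names and the local path passed to `verify_download` are hypothetical, and only the checksum value is taken from this record.

```python
import hashlib
import io

# Checksum published on this record for fss.csv.
EXPECTED_MD5 = "6e6061b643e2628c674adb9e4845f364"


def md5_of_stream(handle, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary file-like object, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == expected


# Self-check on in-memory bytes: the MD5 of empty input is a well-known constant.
assert md5_of_stream(io.BytesIO(b"")) == "d41d8cd98f00b204e9800998ecf8427e"
```

Calling `verify_download("fss.csv")` on a local copy should return `True` when the file matches the published checksum.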
Additional details
Dates
- Collected: 2025-09-17