
Roche, Mathieu

Author: Roche Mathieu
Publication venue
Publication date: 17/03/2016
Field of study

Banwa Publications (University of the Philippines Mindanao)

CRESI BigDataPol - Terrain Guadeloupe : Corpus

Author: Bonin Muriel
Roche Mathieu
Publication venue
Publication date: 16/01/2018
Field of study

CIRAD Dataverse

CRESI BigDataPol - Terrain Guadeloupe : Termes extraits automatiquement

Author: Bonin Muriel
Roche Mathieu
Publication venue
Publication date: 16/01/2018
Field of study

CONTEXT: CRESI BigDataPol project (http://textmining.biz/Projects/BigDataPol). Aim: to apply Big Data approaches to the analysis of the processes and effects of public policies in rural areas. Social sciences and humanities research questions adapted to the Guadeloupe field site: (1) Citizen participation in the process of building agri-environmental policies. (2) Opinions of different stakeholders on the development processes and effects of public policies relating to agroecology in Guadeloupe. Choice of a research object (Guadeloupe field site) within the CRESI BigDataPol project: the controversies over aerial spraying against banana leaf spot disease (cercosporiose) in Guadeloupe, chosen because of citizen protest and a succession of bans and exemptions resulting from a power struggle between civil society and banana producers. EXTRACTED TERMS: Terms automatically identified with BioTex, extracted and ranked according to 4 strategies on 2 corpora: (1) all: single-word and multi-word terms, (2) multi: multi-word terms, (3) C-value: a ranking that can favor longer terms, (4) F_TFIDF_C: a ranking that takes discriminative power into account. The data also contain a selection of the top 100 terms returned by each measure/strategy for each corpus (files termes_corpus1_BigDataPol_09122017.txt and termes_corpus2_BigDataPol_09122017.txt). The two corpora used for this terminology extraction are: (1) Corpus 1: texts from "civil society" (environmental protection associations and LKP), (2) Corpus 2: texts from the "Groupement des producteurs de banane de Guadeloupe"
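
To illustrate why a C-value ranking can favor longer terms, here is a minimal sketch of the classic C-value measure for multi-word terms. It is an assumption that this matches what BioTex computes; the tool's actual variant and parameters are not described in this record. The function names and the example frequencies are hypothetical.

```python
import math


def contains(longer, shorter):
    """True if `shorter` is a contiguous word sub-sequence of the strictly longer `longer`."""
    n, m = len(longer), len(shorter)
    return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freqs):
    """Classic C-value for multi-word candidate terms (sketch, not BioTex's code).

    freqs maps each candidate term to its corpus frequency.
    """
    tokens = {term: tuple(term.split()) for term in freqs}
    scores = {}
    for a, freq_a in freqs.items():
        # Longer candidates in which `a` appears as a nested term.
        nesting = [b for b in freqs if contains(tokens[b], tokens[a])]
        if nesting:
            # Discount occurrences explained by the longer terms.
            freq_a = freq_a - sum(freqs[b] for b in nesting) / len(nesting)
        # log2 of the term length rewards longer terms.
        scores[a] = math.log2(len(tokens[a])) * freq_a
    return scores


# Hypothetical counts, for illustration only.
freqs = {"aerial spraying": 5, "banana aerial spraying": 2}
scores = c_value(freqs)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the three-word term outranks the more frequent two-word term it contains, because the nested term's frequency is discounted and the length factor grows with the number of words. With this formulation single-word terms score zero, so it only makes sense for multi-word candidates.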

CIRAD Dataverse

STAR-FARM - Workshop Data: Co-construction of datasets dealing with agroecology practices in the Mekong Delta

Author: Ma Thanh
Roche Mathieu
Ducrot Raphaëlle
Publication venue
Publication date: 19/12/2025
Field of study

The STAR-FARM Workshop, held on October 22–23, 2025, at Can Tho University, Vietnam, marked a key milestone in advancing international collaboration at the crossroads of agroecology, artificial intelligence (AI), and participatory science. Jointly organized by CIRAD, IRD, FAO, and Can Tho University, the event gathered experts from Southeast Asia and Europe to co-construct multilingual datasets and lexicons capturing agroecological practices and innovation processes in the Mekong Delta. Over two dynamic days, participants explored how AI-driven text mining and participatory methods could enhance access to and understanding of agricultural knowledge. Core activities included lexicon development, corpus annotation, evaluation of AI tools, and discussions on open-access strategies. Beyond its technical outputs, the workshop served as a vibrant forum for intercultural dialogue, debating key notions such as community-driven innovation, local knowledge systems, and the ethical use of AI in research. The workshop’s outcomes, particularly the creation of the STAR-FARM Lexicon and annotated corpora, lay the foundation for long-term cooperation, capacity building, and open science in the region. By blending technological innovation with participatory values, STAR-FARM exemplifies how AI can empower local communities and foster sustainable, inclusive agricultural transformation across the Mekong Delta and beyond

CIRAD Dataverse

Valorcarn-TETIS: Candidates for OTR (Ontological and Terminological Resource)

Author: Shrivastava Gaurav
Roche Mathieu
Teisseire Maguelonne
Publication venue
Publication date: 19/09/2017
Field of study

Text Mining: The terms extracted by text-mining approaches are candidates for an OTR (Ontological and Terminological Resource) associated with the Valorcarn Project. -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents to identify processes that reduce losses and waste

CIRAD Dataverse

PRETORIA lexicon

Author: Helmer Thierry
Roche Mathieu
Martin Pierre
Publication venue
Publication date: 13/05/2022
Field of study

The Long-term EU-AU Research and Innovation Partnership for Food and Nutrition Security and Sustainable Agriculture (LEAP4FNSSA) is a Coordination and Support Action (CSA). The main objective of the project is to provide a tool for European and African institutions to engage in a Sustainable Partnership Platform for research and innovation on Food and Nutrition Security, and Sustainable Agriculture (FNSSA). Work Package 3 (WP3) of the project aims to provide the core information system for the partnership platform. In this context, the PRETORIA lexicon is proposed and integrated into the KEOPS software. The PRETORIA lexicon, based on 8 concepts from the food security domain, is the result of a brainstorming session held during a workshop in Pretoria

CIRAD Dataverse

Valorcarn-TETIS: Terms extracted with Rake

Author: Shrivastava Gaurav
Roche Mathieu
Teisseire Maguelonne
Publication venue
Publication date: 19/09/2017
Field of study

Text-Mining: Terms extracted with the Rake tool (https://github.com/aneesha/RAKE) from the "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Mining of scientific documents to identify processes that reduce losses and waste

CIRAD Dataverse

Valorcarn-TETIS: Semantic groups of terms

Author: Shrivastava Gaurav
Roche Mathieu
Teisseire Maguelonne
Publication venue
Publication date: 19/09/2017
Field of study

Text-Mining: The extracted terms are grouped according to the head (first and last words), e.g. (1) food consumption / food pathogen / food preservation, (2) spoiled biltong / venison biltong / wet biltong, and so forth. -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents to identify processes that reduce losses and waste
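
The grouping described above can be sketched as follows. This is a minimal illustration that assumes terms are simply bucketed by their first word and by their last word; the dataset record does not say how the actual grouping was implemented, and the function name is hypothetical.

```python
from collections import defaultdict


def group_by_edge_words(terms):
    """Group multi-word terms by shared first word and by shared last word."""
    by_first = defaultdict(list)
    by_last = defaultdict(list)
    for term in terms:
        words = term.lower().split()
        if len(words) < 2:
            continue  # single-word terms have no distinct first/last word
        by_first[words[0]].append(term)
        by_last[words[-1]].append(term)
    # Keep only groups that actually gather several terms.
    first_groups = {w: ts for w, ts in by_first.items() if len(ts) > 1}
    last_groups = {w: ts for w, ts in by_last.items() if len(ts) > 1}
    return first_groups, last_groups


# The example terms quoted in the dataset description.
terms = [
    "food consumption", "food pathogen", "food preservation",
    "spoiled biltong", "venison biltong", "wet biltong",
]
first_groups, last_groups = group_by_edge_words(terms)
```

Here `first_groups` gathers the three "food ..." terms under `food`, and `last_groups` gathers the three "... biltong" terms under `biltong`.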

CIRAD Dataverse

KEOPS - LEAP4FNSSA - Output

Author: Chaminuka Petronella
Lutzeyer Hans-Jörg
Rokka Susanna
Csorba Adam
Lindsten Agneta
Petithuguenin Philippe
Roche Mathieu
Weissteiner Christof
Joutsjoki Vesa
Helmer Thierry
Dimitriou Ioannis
Okalany Emmanuel
Carrasco Violeta
van Boheemen Peter
Lundén Tomas
Moephuli Shadrack
Martin Pierre
Plath Melissa
Publication venue
Publication date: 12/02/2021
Field of study

The Long-term EU-AU Research and Innovation Partnership for Food and Nutrition Security and Sustainable Agriculture (LEAP4FNSSA) is a Coordination and Support Action (CSA). The main objective of the project is to provide a tool for European and African institutions to engage in a Sustainable Partnership Platform for research and innovation on Food and Nutrition Security, and Sustainable Agriculture (FNSSA). Work Package 3 (WP3) of the project aims to provide the core information system for the partnership platform. These data are the results of three virtual workshop sessions organized on 23rd and 24th of November 2020 (1st and 2nd sessions), and 9th of December 2020 (3rd session). Topic of data: Evaluation of KEOPS output. Data description: (1) Type of results and visualisations to propose for the KEOPS platform, (2) 7 contributions (WP1 + HLPD members) + 9 contributions (WP3 members)

CIRAD Dataverse

Novelty dataset (animal health, food security, climate change)

Author: Owuor Dickson
Roche Mathieu
Menya Edmond
Interdonato Roberto
Publication venue
Publication date: 05/05/2025
Field of study

This dataset contains sets of news article segments in English related to three domains, namely animal health, food security and climate change. It has been used to fine-tune and evaluate GPT3.5-turbo, GPT4o, DeepSeek-V3, DeBERTa, RoBERTa, BERT, EpidBioELECTRA and EpidGPT models for novelty detection tasks in the three domains. It is composed of 10,660 animal disease, 1,100 food security and 2,200 climate change article segments in csv format with information about the parent articles (segment, doc id, seg id, title, source url, publication date, article domain, article subdomain). The animal health domain is made up of 22 subdomains, with article segments on Avian Influenza (AI) 2310, Highly Pathogenic Avian Influenza (HPAI) 3060, African Swine Fever (ASF) 1000, Foot-and-Mouth disease (FMD) 1165, Bovine Spongiform Encephalopathy (BSE) 770, Brucellosis 435, Peste des Petits Ruminants (PPR) 160, Bluetongue 165, Newcastle disease 155, Glanders 160, Disease X 140, Anthrax 120, West Nile Virus (WNV) 145, Middle East respiratory syndrome (MERS) 215, Infectious Salmon Anaemia (ISA) 110, Equine Influenza (EI) 200, Eastern Equine Encephalitis (EEE) 50, Porcine Reproductive and Respiratory Syndrome (PRRS) 85, Rift Valley Fever (RVF) 30, Classical Swine Fever (CSF) 40, Rabies 30, Venezuelan Equine Encephalitis (VEE) 55, Viral Haemorrhagic Septicaemia (VHS) 15. The climate change domain is made up of 7 subdomains, with article segments on flash floods 340, drought 394, wildfires 184, hurricanes 349, heatwaves 385, global warming 299 and tsunamis 340. The original articles dataset (corpus) contains the documents from which the segments were created: original PADI-Web articles (relevant articles only) and articles obtained from the GDELT database (for news articles on food security events and climate change events)

CIRAD Dataverse