Filippo Pellegrino
Computational Linguist · NLP Researcher
Research Profile
Computational linguist with expertise in natural language processing and machine learning. Contributed to advancing automatic assessment techniques for student writing and developed conversational AI systems for cultural heritage applications. Work bridges theoretical computational linguistics with practical implementations, combining deep learning with traditional linguistic analysis to create interpretable NLP solutions.
Academic & Professional Experience
Eurac Research · Bolzano, IT
- Designed and developed an automated EPUB-to-Markdown conversion pipeline for multilingual document processing
- Constructed a multilingual benchmark dataset (Italian, German) for image-to-Markdown document understanding tasks using Project Gutenberg
- Implemented a VLM evaluation framework with NED, BLEU, Structure F1, and BERTScore metrics
- Benchmarked state-of-the-art vision-language models (Qwen2.5-VL, Pixtral) on page-image-to-Markdown generation
Compet-e Srl · Italy
- Designed and implemented AI/NLP architectures for industrial applications
- Developed REST API systems for natural language processing tasks
- Implemented local LLM deployment strategies using the Ollama framework
- Developed NLP-powered web applications
Università di Napoli L'Orientale · Naples, IT
- Principal researcher on the 'Testa di Marianna' project for interactive cultural heritage applications
- Collaborated with the Unior NLP group and the 'Casa delle Tecnologie Emergenti'
- Developed conversational AI systems for cultural heritage display
- Researched open-source LLM adaptation and fine-tuning (LLaMA family)
- Curated specialised datasets for Naples-specific cultural content
Eurac Research · Bolzano, IT
- Principal researcher in the final phase of the ITACA project on automated essay evaluation
- Developed coherence evaluation methodology for context-specific texts
- Implemented BERT-based models for educational text analysis
- Co-authored peer-reviewed publication on automatic coherence evaluation
- Presented research findings at CliC-it 2024 conference (poster)
Eurac Research · Bolzano, IT
- Participated in EVALITA 2023 shared tasks (DisCoTex, LangLearn)
- Developed machine learning models for language learning assessment
- Co-authored research paper on interpretable language learning evaluation
Education
Free University of Bolzano · Bolzano, IT
Thesis: Local Coherence Modeling for the Italian Language
Supervisor: Prof. Luca Ducceschi
Focus: Computational Linguistics, NLP
University of Turin · Turin, IT
Thesis: Vida del Escudero Marcos de Obregon: oltre la picaresca
Supervisor: Prof. Veronica Orazi
Publications & Conferences
2024
Towards an Automatic Evaluation of (In)coherence in Student Essays
Pellegrino, F., Frey, J. C., & Zanasi, L.
Proceedings of CliC-it 2024
2023
bot.zen at LangLearn: regressing towards interpretability
Stemle, E. W., Tebaldini, M., Bonanni, F., Pellegrino, F., et al.
Proceedings of EVALITA 2023
Conference Presentations
CliC-it 2024: Automatic Coherence Evaluation in Educational Texts (Poster Presentation)
Technical Skills
Programming Languages: Python (Advanced), R (Basic), Bash/Linux (Basic), LaTeX
ML/NLP Frameworks: PyTorch, Hugging Face Transformers, BERT Models, Scikit-learn, OpenAI API, Ollama
Development Tools: Git/GitHub/GitLab, Flask, REST API Development, Vector Databases, Pandas, BeautifulSoup
Research Interests
Languages
Italian: Native
English: Advanced, C1
Spanish: Advanced, C1
German: Intermediate, B1