
Filippo Pellegrino

Computational Linguist · NLP Researcher

Research Profile

Computational linguist with expertise in natural language processing and machine learning. Contributed to advancing automatic assessment techniques for student writing and developed conversational AI systems for cultural heritage applications. Work bridges theoretical computational linguistics with practical implementations, combining deep learning with traditional linguistic analysis to create interpretable NLP solutions.

Academic & Professional Experience

Research Collaborator · Feb 2026 – Mar 2026

Eurac Research · Bolzano, IT

  • Designed and developed an automated EPUB-to-Markdown conversion pipeline for multilingual document processing
  • Constructed a multilingual benchmark dataset (Italian, German) for image-to-Markdown document understanding tasks using Project Gutenberg
  • Implemented a VLM evaluation framework with NED, BLEU, Structure F1, and BERTScore metrics
  • Benchmarked state-of-the-art vision-language models (Qwen2.5-VL, Pixtral) on page-image-to-Markdown generation

NLP Engineer Intern · May 2025 – Aug 2025

Compet-e Srl · Italy

  • Designed and implemented AI/NLP architectures for industrial applications
  • Developed REST API systems for natural language processing tasks
  • Implemented local LLM deployment strategies using the Ollama framework
  • Developed NLP-powered web applications

Junior Research Assistant · Jul 2024 – Feb 2025

Università di Napoli L'Orientale · Naples, IT

  • Principal researcher on the 'Testa di Marianna' project for interactive cultural heritage applications
  • Collaborated with the Unior NLP group and the 'Casa delle Tecnologie Emergenti'
  • Developed conversational AI systems for cultural heritage display
  • Researched open-source LLM adaptation and fine-tuning (LLaMA family)
  • Curated specialised datasets for Naples-specific cultural content

Research Intern · Feb 2024 – Jul 2024

Eurac Research · Bolzano, IT

  • Principal researcher in the final phase of the ITACA project on automated essay evaluation
  • Developed coherence evaluation methodology for context-specific texts
  • Implemented BERT-based models for educational text analysis
  • Co-authored peer-reviewed publication on automatic coherence evaluation
  • Presented research findings at the CliC-it 2024 conference (poster)

Research Intern · Feb 2023 – Mar 2023

Eurac Research · Bolzano, IT

  • Participated in EVALITA 2023 shared tasks (DisCoTex, LangLearn)
  • Developed machine learning models for language learning assessment
  • Co-authored research paper on interpretable language learning evaluation

Education

Master of Arts in Applied Linguistics · Oct 2021 – Mar 2024

Free University of Bolzano · Bolzano, IT

Thesis: Local Coherence Modeling for the Italian Language

Supervisor: Prof. Luca Ducceschi

Focus: Computational Linguistics, NLP

Bachelor of Arts in Linguistic Mediation · Oct 2017 – Jul 2021

University of Turin · Turin, IT

Thesis: Vida del Escudero Marcos de Obregon: oltre la picaresca

Supervisor: Prof. Veronica Orazi

Publications & Conferences

2024

Towards an Automatic Evaluation of (In)coherence in Student Essays

Pellegrino, F., Frey, J. C., & Zanasi, L.

Proceedings of CliC-it 2024

2023

bot.zen at LangLearn: regressing towards interpretability

Stemle, E. W., Tebaldini, M., Bonanni, F., Pellegrino, F., et al.

Proceedings of EVALITA 2023

Conference Presentations

CliC-it 2024: Automatic Coherence Evaluation in Educational Texts (Poster Presentation)

Technical Skills

Programming Languages: Python (Advanced), R (Basic), Bash/Linux (Basic), LaTeX

ML/NLP Frameworks: PyTorch, Hugging Face Transformers, BERT Models, Scikit-learn, OpenAI API, Ollama

Development Tools: Git/GitHub/GitLab, Flask, REST API Development, Vector Databases, Pandas, BeautifulSoup

Research Interests

Computational Linguistics · Human-AI Interaction · Multimodal Models · Language Encoding & Information Retrieval · Educational Technology · Explainable AI

Languages

Italian: Native

English: Advanced, C1

Spanish: Advanced, C1

German: Intermediate, B1