We contribute to the broader machine learning community through open source projects.
CLI-based playbook for evaluating LLMs with MLflow logging. Supports multiple providers including Azure OpenAI, Ollama, OpenRouter, and Alibaba Cloud. Enables testing a model under evaluation against a configurable judge model.
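The model-versus-judge setup described above can be illustrated with a minimal sketch. It is not the project's code: the `Model` and `Judge` type aliases, the `evaluate` function, and the toy model and exact-match judge are all hypothetical names chosen for illustration. Plain Python callables stand in for the provider-backed models. In a real run, the returned metrics would presumably be logged to MLflow (for example with `mlflow.log_metric`) rather than just returned.

```python
from statistics import mean
from typing import Callable

# Hypothetical interfaces, not taken from the project:
# a model maps a prompt to a completion, and a judge scores
# (question, answer, reference) with a float in [0, 1].
Model = Callable[[str], str]
Judge = Callable[[str, str, str], float]


def evaluate(model: Model, judge: Judge,
             dataset: list[tuple[str, str]]) -> dict[str, float]:
    """Run the model on each question and let the judge score each answer."""
    if not dataset:
        raise ValueError("dataset must contain at least one example")
    scores = [judge(question, model(question), reference)
              for question, reference in dataset]
    return {"mean_score": mean(scores), "n_examples": float(len(scores))}


def toy_model(prompt: str) -> str:
    # Stand-in for the model under evaluation.
    return "4" if prompt == "2 + 2?" else "unknown"


def exact_match_judge(question: str, answer: str, reference: str) -> float:
    # Stand-in for an LLM judge: full credit only on an exact match.
    return 1.0 if answer.strip() == reference.strip() else 0.0


metrics = evaluate(
    toy_model,
    exact_match_judge,
    [("2 + 2?", "4"), ("Capital of France?", "Paris")],
)
print(metrics)
```

Keeping the judge behind a plain callable means it can be swapped (a different provider, a different scoring prompt) without touching the evaluation loop.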
Multi-structured financial document question answering using retrieval-augmented generation (RAG). Extracts text, tables, and figures from financial report PDFs via Azure Document Intelligence, with LangChain-based chunking and Azure OpenAI embeddings. Associated with the ECIR 2026 paper "Understanding Multi-Structured Documents via LLMs."