Research Collaboration ML Matching API

A case study of the Python and FastAPI matching service that Fusio(n) uses for researcher-to-researcher matching. The service combines transformer embeddings, structured profile categories, bidirectional scoring, caching, and accelerated inference.

Technical focus: ML service architecture, matching logic, and integration perspective
Year: 2026
Stack: Python, FastAPI, PyTorch, Transformers, SPECTER2, all-MiniLM-L6-v2, SciBERT, Docker, CUDA, Pydantic

Architecture diagram

Dedicated ML matching service architecture

The service sits behind Fusio(n) as a server-to-server FastAPI boundary for model loading, structured profile normalization, bidirectional scoring, workgroup creation, caching, and accelerated inference.

flowchart LR
  classDef caller fill:#dceeff,stroke:#4338ca,color:#151315,stroke-width:1.5px
  classDef api fill:#fee2d2,stroke:#c2410c,color:#151315,stroke-width:1.5px
  classDef engine fill:#dff3e7,stroke:#087f8c,color:#151315,stroke-width:1.5px
  classDef runtime fill:#fff7db,stroke:#b7791f,color:#151315,stroke-width:1.5px

  subgraph caller["Product caller"]
    direction TB
    fusio["Fusio(n) Laravel app<br/>ML Flow / Team Science"]
    payload["Structured profile payload<br/>domain / topics / methods / goals"]
  end

  subgraph boundary["FastAPI boundary"]
    direction TB
    endpoints["Endpoint families<br/>matching / workgroups / status / health"]
    schemas["Pydantic models<br/>requests / responses / validation"]
    jobs["BackgroundTasks<br/>large workgroup jobs"]
  end

  subgraph engine["Matching engine"]
    direction TB
    matcher["MatcherService<br/>model loading / scoring orchestration"]
    models["Embedding models<br/>SciBERT / SPECTER2 / MiniLM"]
    scoring["Bidirectional scoring<br/>category breakdowns / title fit"]
    groups["Workgroup algorithms<br/>k-means / deterministic / genetic"]
  end

  subgraph runtime["Runtime and cache"]
    direction TB
    device["Device manager<br/>CUDA / MPS / CPU fallback"]
    redis[("Redis cache<br/>embeddings / match results / job state")]
    deploy["Deployment paths<br/>Docker CPU/GPU / Modal"]
  end

  fusio --> payload
  payload -->|"server-to-server JSON"| endpoints
  endpoints --> schemas
  schemas --> matcher
  endpoints --> jobs
  jobs --> matcher
  matcher --> models
  matcher --> scoring
  matcher --> groups
  models --> device
  matcher --> redis
  jobs --> redis
  device --> deploy
  redis --> endpoints
  scoring -->|"ranked matches"| fusio
  groups -->|"workgroup results / job IDs"| fusio

  class fusio,payload caller
  class endpoints,schemas,jobs api
  class matcher,models,scoring,groups engine
  class device,redis,deploy runtime

The diagram distinguishes the current local FastAPI code from the Modal deployment path, where broad MiniLM matching is implemented.
The public page describes service boundaries without publishing private production URLs or complete payload contracts.

Impact

Implemented a companion ML matching service that Fusio(n) can call for collaborator ranking and workgroup formation instead of keeping all matching logic inside Laravel.
Implemented multiple matching paths: legacy SciBERT matching, SPECTER2 technical matching, fast structured category matching, and all-MiniLM-L6-v2 broad matching.
Documented performance and runtime considerations across CPU/GPU execution, embedding cache reuse, batch inference, and Laravel integration patterns without exposing private integration details.

This ML API is the companion matching engine behind the Fusio(n) research collaboration platform. Fusio(n) provides the product experience, profiles, teams, communication, and operations. This service handles the heavier model-based matching and workgroup logic behind a server-to-server boundary.

The reason to separate it from Fusio(n) is architectural: the matching workload has different runtime needs from a Laravel application. It loads transformer models, benefits from GPU acceleration, needs embedding caches, and has long-running workgroup jobs. Keeping it as a dedicated API makes the product platform cleaner while allowing the matching engine to evolve independently.

Product Relationship

Fusio(n) offers several discovery paths: deterministic matching, semantic matching, team science, and an ML flow. This project is the ML flow service. Laravel sends user profile data to the API, receives ranked matches or workgroup results, and can use those results inside the Fusio(n) collaboration experience.

The repository includes Laravel integration examples for server-to-server calls. The public case study describes the integration shape without exposing private path names, access details, or complete payload contracts.

Service Interface

The service exposes a FastAPI application for research collaboration matching. Publicly relevant interface families include:

legacy user matching
ordered structured-profile matching
optimized scientific-profile matching
broader weighted ensemble matching
workgroup formation
asynchronous status reporting for larger workloads

The local FastAPI app initializes MatcherService during startup, loads the embedding models, stores the matcher on app state, and routes requests through typed Pydantic request and response models.

Matching Models

The service evolved through several model strategies:

SciBERT for the first research-text matching path and backward-compatible service behavior
SPECTER2 for scientific and technical document embeddings
all-MiniLM-L6-v2 for faster broad semantic matching
TF-IDF and overlap scoring for weighted ensemble behavior in the broader matching path

SPECTER2 is used where scientific and technical context matters. The optimized path embeds structured categories separately, then combines category-level similarities with a weighted scoring design that favors topical and domain alignment while still preserving method, goal, and location relevance.
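The category-level combination described above can be illustrated with a minimal sketch. The weight values, the function names, and the renormalization of missing categories are assumptions made for illustration; the source only states that topical and domain alignment are favored, and the real service computes its vectors with SPECTER2 rather than receiving them as plain lists.

```python
import math

# Hypothetical weights: the case study says topics and domain are favored,
# but it does not publish the actual values.
CATEGORY_WEIGHTS = {
    "topics": 0.35,
    "domain": 0.30,
    "methods": 0.15,
    "goals": 0.10,
    "locations": 0.10,
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_category_score(left, right, weights=CATEGORY_WEIGHTS):
    """Combine per-category cosine similarities into one score.

    `left` and `right` map category names to embedding vectors.
    Categories missing on either side are skipped and the remaining
    weights are renormalized (an assumption, not confirmed by the source).
    """
    breakdown = {}
    weighted_sum = 0.0
    total_weight = 0.0
    for category, weight in weights.items():
        if category in left and category in right:
            similarity = cosine(left[category], right[category])
            breakdown[category] = similarity
            weighted_sum += weight * similarity
            total_weight += weight
    score = weighted_sum / total_weight if total_weight else 0.0
    return score, breakdown
```

Returning the per-category breakdown alongside the combined score mirrors the category breakdowns that the match results expose.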

Scoring Design

The user matching logic is bidirectional. Rather than asking only whether one user matches another, it scores both how well the target user’s search profile matches the candidate’s expertise and how well the candidate’s search profile matches the target user’s expertise.

The direct matching path uses:

primary score: target search -> candidate expertise
secondary score: candidate search -> target expertise
title or seniority compatibility
phrase overlap
category alignment
optional description similarity
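The two directional scores in the list above can be sketched as follows. This is a simplified illustration, not the service's scoring code: the 0.6 weighting, the `MatchScore` name, and the payload keys `search` and `expertise` are hypothetical, and a keyword overlap function stands in for the embedding similarity that the real paths use.

```python
from dataclasses import dataclass


@dataclass
class MatchScore:
    final: float
    primary: float    # target search -> candidate expertise
    secondary: float  # candidate search -> target expertise


def phrase_overlap(a, b):
    """Jaccard overlap between two keyword lists, ignoring case and padding."""
    set_a = {p.strip().lower() for p in a if p.strip()}
    set_b = {p.strip().lower() for p in b if p.strip()}
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def bidirectional_score(target, candidate, similarity=phrase_overlap,
                        primary_weight=0.6):
    """Score both directions and blend them into one final value.

    `similarity` is pluggable so an embedding-based function can replace
    the overlap stand-in; `primary_weight` is a hypothetical value.
    """
    primary = similarity(target["search"], candidate["expertise"])
    secondary = similarity(candidate["search"], target["expertise"])
    final = primary_weight * primary + (1.0 - primary_weight) * secondary
    return MatchScore(final=final, primary=primary, secondary=secondary)
```

Keeping both directional values on the result lets a caller see when a match is strong in only one direction.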

The SPECTER2 fast path keeps the same directionality but computes category-specific embeddings in batches. It also applies a domain/topic pre-filter so a target user does not need to compare against every possible candidate when a clear domain boundary exists.
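A domain/topic pre-filter of this kind might look like the sketch below. The function name, the exact-term comparison, and the fallback to the full candidate list are assumptions; the source does not describe how the service decides that a clear domain boundary exists.

```python
def prefilter_candidates(target, candidates):
    """Keep candidates sharing at least one domain or topic term with the target.

    Falls back to the full list when the target has no domain or topic
    terms, or when filtering would leave nothing, so the pre-filter never
    hides every candidate (a hypothetical safety rule).
    """
    def terms(profile):
        values = profile.get("domain", []) + profile.get("topics", [])
        return {v.strip().lower() for v in values if v.strip()}

    target_terms = terms(target)
    if not target_terms:
        return list(candidates)
    kept = [c for c in candidates if target_terms & terms(c)]
    return kept or list(candidates)
```

Only the surviving candidates would then go through the more expensive batched embedding comparison.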

This makes the service more practical for Fusio(n): match results can include a final score, directional scores, category breakdowns, description similarity, and total processing time rather than just a black-box nearest-neighbor result.

Structured Profile Contract

The API supports both legacy comma-separated keyword strings and structured profile categories. The structured contract is important because Fusio(n) can collect profile intent in a more useful shape than a single keyword blob.

Structured profiles include:

domain
topics
methods
locations
goals
optional narrative description

The Pydantic validators normalize structured profile data so the matching layer receives clean ordered category arrays even when upstream form data is inconsistent.
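The kind of normalization involved can be sketched in plain Python. The service does this inside Pydantic validators; the standalone functions below, their names, and the specific rules (trimming, whitespace collapsing, case-insensitive de-duplication) are illustrative assumptions rather than the actual validator code.

```python
CATEGORIES = ("domain", "topics", "methods", "locations", "goals")


def normalize_category(value):
    """Turn a comma-separated string, a list, or None into a clean ordered list."""
    if value is None:
        return []
    if isinstance(value, str):
        items = value.split(",")
    else:
        items = [str(item) for item in value]
    cleaned = []
    seen = set()
    for item in items:
        text = " ".join(item.split())  # trim and collapse inner whitespace
        key = text.lower()
        if text and key not in seen:   # drop blanks and case-insensitive repeats
            seen.add(key)
            cleaned.append(text)
    return cleaned


def normalize_profile(raw):
    """Return a profile with every category present as an ordered list."""
    profile = {name: normalize_category(raw.get(name)) for name in CATEGORIES}
    description = (raw.get("description") or "").strip()
    profile["description"] = description or None
    return profile
```

Because order is preserved, a legacy comma-separated string and an equivalent structured list normalize to the same array.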

Workgroup Creation

The API also supports workgroup creation. It accepts participant and target-topic data, then returns suggested groups according to the selected matching strategy.

The public contract supports deterministic, genetic, and k-means modes. Larger workloads can be processed asynchronously, with progress tracked behind the service boundary instead of blocking the request lifecycle.
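The split between inline and asynchronous processing can be sketched as follows. Everything named here is hypothetical: the threshold of 50 participants, the status strings, the in-memory `JOB_STORE` (standing in for the Redis job state), and the sort-and-slice grouping, which is only a placeholder for the deterministic, genetic, and k-means modes.

```python
import uuid

JOB_STORE = {}        # in-memory stand-in for the Redis job-state cache
ASYNC_THRESHOLD = 50  # hypothetical participant count that triggers async mode


def chunk_groups(participants, group_size):
    """Placeholder grouping: sort the ids and slice into fixed-size groups."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    ordered = sorted(participants)
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


def run_job(job_id, participants, group_size):
    """Body of the background task: compute groups and record the outcome."""
    JOB_STORE[job_id]["status"] = "running"
    try:
        groups = chunk_groups(participants, group_size)
        JOB_STORE[job_id].update(status="completed", result=groups)
    except Exception as exc:
        JOB_STORE[job_id].update(status="failed", error=str(exc))


def run_inline(func, *args):
    """Default scheduler used when no background runner is supplied."""
    func(*args)


def create_workgroups(participants, group_size, schedule=None):
    """Answer small requests directly; hand large ones to a scheduler."""
    if len(participants) < ASYNC_THRESHOLD:
        return {"status": "completed",
                "result": chunk_groups(participants, group_size)}
    job_id = uuid.uuid4().hex
    JOB_STORE[job_id] = {"status": "queued", "result": None}
    # `schedule` plays the role of FastAPI's BackgroundTasks.add_task.
    (schedule or run_inline)(run_job, job_id, participants, group_size)
    return {"status": "accepted", "job_id": job_id}


def job_status(job_id):
    """Look up the recorded state for a job id."""
    return JOB_STORE.get(job_id, {"status": "unknown"})
```

In a FastAPI handler, passing `background_tasks.add_task` as `schedule` would defer `run_job` until after the response is sent, and the caller would poll a status endpoint with the returned job id.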

The workgroup paths include scientific and broader semantic variants so teams can be shaped around technical cohesion or collaboration diversity depending on the product need.

Runtime and Caching

The runtime is built around model loading and embedding reuse:

PyTorch chooses CUDA, Apple MPS, or CPU based on the available device
embedding and match-result reuse reduces repeated model work
larger jobs use background processing
batch sizes are configurable for general embeddings and category embeddings
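Embedding reuse with a configurable batch size can be sketched as below. The `EmbeddingCache` class, its key format, and the hit and miss counters are hypothetical, and a dictionary stands in for Redis; the sketch only shows the general idea of keying embeddings by model and text so that repeated inputs skip the model.

```python
import hashlib


class EmbeddingCache:
    """In-memory stand-in for a Redis-backed embedding cache."""

    def __init__(self, embed_batch, model_name):
        self._embed_batch = embed_batch  # callable: list[str] -> list of vectors
        self._model_name = model_name
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, text):
        # Key on model name plus whitespace-normalized text, so a model
        # change never returns vectors produced by a different model.
        normalized = " ".join(text.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        return f"emb:{self._model_name}:{digest}"

    def get_many(self, texts, batch_size=32):
        """Return one vector per text, embedding only uncached texts in batches."""
        keys = [self._key(t) for t in texts]
        missing = {}
        for key, text in zip(keys, texts):
            if key not in self._store and key not in missing:
                missing[key] = text
        self.misses += len(missing)
        self.hits += len(texts) - len(missing)
        pending = list(missing.items())
        for start in range(0, len(pending), batch_size):
            batch = pending[start:start + batch_size]
            vectors = self._embed_batch([text for _, text in batch])
            for (key, _), vector in zip(batch, vectors):
                self._store[key] = vector
        return [self._store[key] for key in keys]
```

Any callable that maps a list of strings to a list of vectors can be passed as `embed_batch`, which keeps the cache independent of the model behind it.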

The service falls back to CPU when no GPU is available. That matters for local development and lower-cost validation, while accelerated runtimes can be used for heavier inference.
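The device choice and CPU fallback can be sketched in a few lines. The function name `select_device` is hypothetical, and the service's own device manager may apply further rules; the PyTorch calls used here are the standard availability checks.

```python
def select_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can use."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed: nothing to accelerate, so report CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The returned string can be passed to `torch.device(...)` or to a model's `.to(...)` call when the models are loaded.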

Deployment Shape

The project includes multiple deployment shapes:

local CPU execution for validation
GPU-oriented container execution for heavier inference
preloaded model layers to reduce cold-start cost
controlled production service boundaries
cache observability without publishing private management paths

The accelerated deployment shape consolidates model usage into one shared service boundary while preserving the API interface.

Laravel Integration

The repository includes Laravel examples for both direct HTTP usage and service-class usage. A typical Fusio(n)-style integration:

loads the target matching context from Laravel
collects all other candidate users
maps users into API payload fields
applies the returned ranked matches inside the product

This integration path makes the ML service a product-facing dependency rather than an isolated experiment.

Verification and Reliability

The repo includes tests for:

payload shape and response structure
enhanced semantic matching behavior
profile category parsing
accelerated and local runtime behavior

It also includes documentation for scoring calculation, runtime optimization, and integration usage. Those artifacts show the operational side of the work: not only designing the scoring logic, but making it deployable, cacheable, testable, and understandable to the Laravel side.

Technical Perspective

This case study demonstrates ML service architecture around a real product integration: Python API design, transformer model selection, bidirectional matching logic, structured profile contracts, batching, caching, GPU/CPU runtime handling, containerized execution, Laravel integration, and technical documentation.