~/profile · open

01 · introduction

Paulo Soares

Senior ML Engineer · Ph.D.

I'm a Senior Machine Learning Engineer at Pinterest. I ship large-scale ranking and representation systems (embeddings, ads, and feed models) backed by disciplined experimentation and measurable business impact (CTR, CPC, engagement). My background spans production ML at Pinterest, research internships, LLM prototyping, and years of high-stakes software engineering in the energy sector.

zsh · manifesto

cat manifesto.txt

# production wisdom · v1

Measure before you scale; document the experiment; ship what survives contact with production.

02 · chronology

Experience

  1. Sr. Machine Learning Engineer

    Pinterest

    • Optimized a large embedding-table model through feature expansion and ablation studies; launched to production with −1.34% CPC, +2.6% CTR, and six-figure annual infrastructure savings.
    • Built end-to-end pre-trained entity embedding system adopted into production Home Feed model, improving engagement by +4.6% and session success by +0.24% with neutral latency.
    • Integrated cross-domain embeddings into the ads ranking model, driving −0.54% CPC, +1.32% CTR, and seven-figure annual infrastructure savings through feature optimization.
    • Created ML experimentation guidelines and delivered technical deep dives, establishing best practices for model development and production maintenance.
  2. Machine Learning Intern

    Tetricus Labs

    • Prototyped synthetic conversational data generation using LLM-powered agents and fine-tuned models for realistic, stylistically tailored outputs; drafted a scientific manuscript for publication.
  3. Pinterest Labs Research Intern

    Pinterest

    • Developed end-to-end model distillation and domain adaptation solutions for large-scale ad ranking systems, applying statistical and deep learning techniques to reduce bias.
    • Designed and implemented big data pipelines for daily model training, inference, and reporting using distributed computing frameworks.
    • Created technical documentation, design specs, and onboarding tutorials that accelerated team development velocity and supported knowledge transfer.
  4. Software Engineer

    Petrobras

    • Led a $5M project to speed up planning of maintenance and drilling operations, with expected annual savings of $15M.
    • Engineered ship usage optimization system, increasing operational efficiency by 15% through reduced downtime.

03 · formation

Education

  • Ph.D. in Computer Science (Machine Learning)

    University of Arizona

    Dissertation (ProQuest)

  • B.S. & M.S. in Computer Science (Machine Learning)

    Universidade Federal de Pernambuco

04 · themes

Interests & research

Ads & feed ranking

embeddings, cross-domain transfer, production constraints

Representation & distillation

domain adaptation, bias reduction at scale

Foundation models

pre-training objectives, scaling, adaptation to ranking & retrieval

LLMs & agents

synthetic data, fine-tuning, tool use, research writing

Causal & experimental ML

rigorous eval, guidelines, deep dives

05 · papers

Papers

  1. 2025[01]

    Decoupled Entity Representation Learning for Pinterest Ads Ranking

    ACM Digital Library

  2. 2024[02]

    Probabilistic modeling of interpersonal coordination processes

    ICML (Proceedings of the 41st International Conference on Machine Learning)

  3. 2023[03]

    The ToMCAT Dataset

    NeurIPS (Datasets & Benchmarks)

  4. 2021[04]

    Probabilistic Modeling of Human Teams to Infer False Beliefs

    AAAI Fall Symposium

  5. 2013[05]

    Proximity measures for link prediction based on temporal events

    Expert Systems with Applications

  6. 2012[06]

    Time Series Based Link Prediction

    IJCNN

Also indexed on ORCID.

06 · toolkit

Skills

How I work and how I ship.

Collaboration & craft

  • Collaborative teammate
  • Critical thinker
  • Attention to detail
  • Persistent problem solver
  • Asks incisive questions
  • Rigorous analysis
  • Parallel workstreams
  • Knowledge sharing & mentoring

Engineering & delivery

  • End-to-end training pipelines
  • Distributed training & batch jobs
  • Offline & online evaluation
  • Experiment design & A/B tests
  • GPU / memory-aware debugging
  • Production monitoring & triage
  • Design docs & technical writing