~/devreads

#machine learning

52 posts

2 Jun

Harrison Katz 11 min read

How Airbnb used sequential geographic recovery signals and prior propagation to generate reliable corridor-level forecasts when local data was scarce. By: Harrison Katz The problem with unprecedented shocks Almost every forecasting system is built on the same implicit assumption: the future will resemble the past. You train on historical data, you validate on holdout periods, and you trust that past…

technology, machine-learning, data-science, data-modeling, forecasting

Criteo Tech 9 min read

Author: Paul Coursaux At Criteo, retail media is about helping brands reach shoppers directly on retailers’ property, right at the digital shelf where purchase decisions are made. Through CMAX , our unified retail media platform, we connect advertisers to retailers’ audiences with Sponsored Products that appear alongside native results in onsite search and browsing experiences. Sponsored products: Boost your brand…

ai, adtech, machine-learning, retail-media, semantic-search

21 May

Pinterest Engineering 12 min read

Authors ( listed alphabetically ) Ads Feature Engineering Infra team: Ajay Venkatakrishnan, Le Zhang Core ML Infra team: Eric Shang, Pihui Wei ML Data team: Connor Votroubek, Yi He User Understanding team: Camilo Munoz, Simin Li If you work on ranking, retrieval, or recommendation systems, you’ve probably asked for some version of the same thing: “Give me the last N…

machine-learning, recommendation-system, engineering, data-infrastructure, pinterest

11 May

Lydia Cho 5 min read

One might think computer vision models are supposed to be easy to put into production. There are whole companies built on that promise: label a few images, click train, click deploy, done. In practice, it’s messier. Most of us working with these models aren’t ML experts, and moving fast to keep up with the industry […] The post Lessons from…

ai for developers, artificial intelligence, machine learning

4 May

Netflix Technology Blog 15 min read

Saish Sali , Nipun Kumar , Sura Elamurugu Introduction As Netflix has grown, machine learning continues to support our ability to deliver value to members and drive excellence across multiple areas of our business. When Netflix began investing in machine learning over a decade ago, it was primarily focused on a single domain: personalization. Scala was the industry standard, our…

mlops, event-driven-architecture, machine-learning, distributed-systems, knowledge-graph

1 May

Netflix Technology Blog 13 min read

By Nipun Kumar , Rajat Shah , Peter Chng Introduction This is the first blog post in a multi-part series that shares technical insights into how our ML model serving infrastructure powers several personalized experiences at scale across various domains (e.g., title recommendations, commerce). In this introductory blog post, we will dive into our domain-independent API abstraction and its traffic…

ai-platform, distributed-systems, infrastructure, machine-learning

Pinterest Engineering 16 min read

Guangtong Bai | Staff Software Engineer, Product ML Infrastructure*; Shantam Shorewala | Software Engineer II, Product ML Infrastructure*; Chi Zhang | Staff Software Engineer, AI Platform*; Neha Upadhyay | Software Engineer II, AI Platform*; Haoyang Li | Director, Product ML Infrastructure *These authors contributed equally to this article. Background At Pinterest, our online ML serving systems employ a root-leaf architecture.…

engineering, pinterest, machine-learning, infrastructure, efficiency

27 Apr

Pinterest Engineering 7 min read

Authors: Richard Huang | Machine Learning Engineer II; Yu Liu | Senior Machine Learning Engineer; Ziwei Guo | Senior Machine Learning Engineer; Andy Mao | Staff Machine Learning Engineer; Supeng Ge | Sr. Staff Machine Learning Engineer Introduction At Pinterest, conversion ads are crucial for matching users with products they are likely to purchase, boosting value for both users and…

recommendation-system, pinterest, monetization, machine-learning, engineering

15 Apr

Pinterest Engineering 14 min read

Vaibhav Shankar; Staff Software Engineer | Raymond Lee; Staff Software Engineer | Chia-Wei Chen; Staff Software Engineer | Shunyao Li; Sr. Software Engineer | Yi Li; Staff Software Engineer | Ambud Sharma; Principal Engineer | Saurabh Vishwas Joshi; Principal Engineer | Charles-A. Francisco; Senior Engineer | Karthik Anantha Padmanabhan; Director, Engineering | David Westbrook; Sr. Manager, Engineering One day in…

performance, kubernetes, pinterest, machine-learning, engineering

13 Apr

Pinterest Engineering 8 min read

Authors: Matt Lawhon | Sr. Machine Learning Engineer; Filip Ryzner | Machine Learning Engineer II; Kousik Rajesh | Machine Learning Engineer II; Chen Yang | Sr. Staff Machine Learning Engineer; Saurabh Vishwas Joshi | Principal Engineer At Pinterest, scaling our recommendation models delivers outsized impact on the quality of the content we serve to users. Our Foundation Model (oral spotlight,…

pinterest, machine-learning, infrastructure, engineering, recommendation-system

26 Feb

12 Feb

Kazuaki Okumura, Mike White, Kevin Altschuler, Facundo Agriel, Ishan Mishra, Eric Wang, Dmitriy Meyerzon, Hicham Badri, Appu Shaji 12 min read

Making products like Dropbox Dash accessible to individuals and businesses means tackling new challenges around efficiency and resource use.

models, quantization, ai, machine learning, dash

6 Jan

Manisha Sudhir 6 min read

Expedia Group Technology — Data Science Empowering developers with seamless vector embedding solutions Photo by Daniela Cuevas on Unsplash Introduction Rapid advances in Machine Learning (ML), especially Generative AI, have increased the need for specialized capabilities like vector embedding similarity search. Vector embeddings are the numerical representations created by machine learning models which allow disparate inputs to be compared against…

machine-learning, vector-database, mls, data-science

18 Dec 2025

26 Nov 2025

Sujit Singh 7 min read

Introduction In an age where artificial intelligence (AI) and machine learning (ML) are integral to almost every aspect of our lives, ensuring the effectiveness, fairness, and reliability of ML models is paramount. Observability plays a crucial role in maintaining the performance of these models, allowing us to detect and resolve issues promptly. At Helpshift, we recognized the need for robust…

analytics, artificial-intelligence, machine-learning, observability

25 Nov 2025

Jean Alves 13 min read

By Jean V. Alves and Ferran Pla Fernández Moving beyond binary classification provides novel insights. In the real world, scams rarely present themselves in black and white. Fraudsters exploit nuance, impersonate legitimate brands, and mask malicious intent with seemingly ordinary behavior. That’s why Feedzai has launched ScamAlert (patent pending), a Generative AI-based system innovating on the current paradigm of scam…

large-language-models, financial-fraud, fraud-prevention, computer-vision, machine-learning

10 Oct 2025

James Chan 6 min read

As a fast-growing home services platform, we heavily utilize machine learning to elevate user experience and improve business processes such as reducing spam, improving search results, and providing recommendations. In recent years, Generative AI has taken the world by storm as a powerful addition to traditional ML. We embraced this mega trend by incorporating LLMs into various areas of our…

data-science, databricks, genai, information-security, machine-learning

26 Aug 2025

Raphael Montaud 7 min read

How we made our email story recommendations better In this Part 1, you’ll understand how we improved one of the main ways our users are exposed to our product and how that led to a massive 7% increase in average reading time for digest users. Intro : This is a 4-part series breaking down improvements to the algorithm…

machine-learning, recommendation-system, software-engineering

25 Aug 2025

Raphael Montaud 6 min read

Cross-Digest diversification In this part 4, we’ll see how we went from investigating a few complaints from digest power users to improving our digest recommendations across the board. Intro : This is a 4-part series breaking down improvements to the algorithm behind the Medium’s Daily Digest over the past year. When we started this work, the Digest was suboptimal —…

programming, recommendation-system, software-engineering, database, machine-learning

Raphael Montaud 10 min read

Hard vs Soft Filtering and how this applies to Medium’s Recommendation System In this part 3 we’ll see how we modified one of our hard filtering rules and attempted to turn it into a machine learning based “soft filter”. Intro : This is a 4-part series breaking down improvements to the algorithm behind the Medium’s Daily Digest over the past…

software-development, recommendation-system, software-engineering, machine-learning

25 Jul 2025

Sofia Guerreiro 8 min read

By Sofia Guerreiro, Ricardo Ribeiro Pereira, Iker Perez, Jacopo Bono Detecting financial fraud is like finding a moving needle in a shifting haystack . Fraud accounts for a tiny fraction of financial transactions, often less than 0.1%. At the same time, fraudsters are constantly adapting their tactics to evade detection. And this happens within a live and dynamic environment, where…

machine-learning, fraud-detection, research, network-intelligence, feedzai

24 Feb 2025

28 Jan 2025

16 Dec 2024

Zhengyu Shen 12 min read

Overview The past few months have been exciting times for Slack’s CI infrastructure. After years of developer frustration with Jenkins (everything from security issues to downtime to generally poor UX) internal pressure led us to move a majority of Slack’s CI jobs from Jenkins to GitHub Actions. My intern project at Slack this summer involved…

uncategorized, ci-cd, devops, devtools, machine-learning

25 Nov 2024

8 Nov 2024

Srivani Bethi 7 min read

Background and motivation In the fast-paced world of software development, having the right tools can make all the difference. At Slack, we’ve been working on a set of AI-powered developer tools that are saving 10,000+ hours of developer time yearly, while meeting our strictest requirements for security, data protection, and compliance. In this post, we’ll…

uncategorized, devtools, machine-learning, search

31 Oct 2024

12 Aug 2024

Sérgio Jesus 9 min read

By Sérgio Jesus, Inês Silva, Pedro Saleiro, Hugo Ferreira, Pedro Bizarro In this blog post we will visit Aequitas Flow , an Open-Source framework designed to run complete and standardized experiments of Fair ML algorithms. We encourage you to try Aequitas Flow with the Google Colab Notebooks, which are available in the project’s GitHub repository . This blog post is…

responsible-ai, fairness, open-source, research, machine-learning

21 Jun 2024

Javier Liébana 13 min read

In the world of financial services, the bank or financial institution’s relationship with the customer relies on digital trust , which is anchored in two fundamental principles. First, it must ensure the person engaging through digital banking channels is genuinely the individual they claim to be. Second, it must confirm that this person is authorized to complete the intended financial…

feedzai, digital-trust, online-fraud-prevention, machine-learning, research

18 Apr 2024

Kelly Moran 6 min read

At Slack, we’ve long been conservative technologists. In other words, when we invest in leveraging a new category of infrastructure, we do it rigorously. We’ve done this since we debuted machine learning-powered features in 2016, and we’ve developed a robust process and skilled team in the space. Despite that, over the past year we’ve been…

uncategorized, aws, engineering, infrastructure, machine-learning

21 Feb 2024

Ilay Chen 12 min read

By Ilay Chen and Tomer Akirav At PayPal, hundreds of thousands of Apache Spark jobs run on an hourly basis, processing petabytes of data and requiring a high volume of resources. To handle the growth of machine learning solutions, PayPal requires scalable environments, cost awareness and constant innovation. This blog explains how Apache Spark 3 and GPUs can help enterprises…

cloud-computing, gpu, big-data, machine-learning, apache-spark

11 Dec 2023

Marina Lyan 4 min read

Photo by fabio on Unsplash PayPal supports over 400 million active consumers and merchants worldwide. Every minute there are several thousand payment transactions. To prevent fraud in real-time at such a scale, we need to streamline our ML workflow and feature engineering processes to build strong predictors of behaviors and risk indicators. On top of that, it must be done…

engineering, declarative-programming, feature-engineering, paypal, machine-learning

28 Nov 2023

Roland Meertens 8 min read

A clustering-based approach to create deep learning datasets in a day Introduction Understanding what’s happening in an image is both an important task and a costly one. In the last few years, the field of computer vision has greatly accelerated due to the advances in neural networks. At Bumble Inc., we see potential value in computer vision for…

data-science, machine-learning, clustering, deep-learning, dataset

31 Aug 2023

25 Apr 2023

Eric Elliott 16 min read

Why Every Developer Should Learn ChatGPT and SudoLang I recently started using an AI Driven Development (AIDD) process that has many benefits: Increased development productivity 10x — 20x , allowing us to take on more projects, and more ambitious challenges that would previously have been too resource-intensive to tackle. Opened up our applications to magical features we could not have…

chatgpt, artificial-intelligence, software-development, machine-learning, ai

3 Apr 2023

Eric Elliott 9 min read

Running Riteway’s usage example tests in SudoLang running on ChatGPT using GPT-4 I have been a long-time advocate of Test-Driven Development (TDD) because of its many productivity and quality benefits. You can read more about those in “TDD Changed My Life” . When I realized that GPT-4 was capable of following complex instructions, one of the first things I thought…

ai, machine-learning, technology, javascript, chatgpt

6 Sept 2022

Katrina Ni 10 min read

Slack, as a product, presents many opportunities for recommendation, where we can make suggestions to simplify the user experience and make it more delightful. Each one seems like a terrific use case for machine learning, but it isn’t realistic for us to create a bespoke solution for each. Instead, we developed a unified framework we…

uncategorized, infrastructure, machine-learning

20 Jul 2022

Dr. Romain Quéré 6 min read

Co-authored by Oumaima BENBAHAKKA , Dr. Paul Farrow , Dr. Romain Quéré and Dr. Yana Volkovich GitHub repository “The yellow pages of the internet” — Credits: Yana Volkovich At Xandr, we are actively participating in ongoing industry discussions about the future of identity, as well as carefully evaluating various emerging proposals. One of those proposals is the Topics API. Deconstructing…

machine-learning, privacy-sandbox, content-classification, future-of-identity

24 Mar 2022

Driven by Code 6 min read

By: Samad Patel This blog post delves into how we answered a challenging business question using pre-trained AWS Models. Our question required us to parse text from photos, then analyze the contents of that text. We used AWS Rekognition and Comprehend to extract and classify text from photos, followed by a few highly interpretable statistical methods to analyze the data.…

cloud-computing, image-processing, machine-learning, linear-regression

25 Oct 2021

Steve Dower 1 min read

Our friends at Anaconda posted a joint announcement last week regarding the use of their repository from Microsoft cloud-hosted products. See the full announcement on their website. Today, Anaconda, Inc. announced a collaboration with Microsoft to enable customers to confidently access Anaconda’s curated library of open-source packages within Microsoft Cloud-hosted products and services, including […] The post Anaconda licensing…

azure, python, anaconda, data science, machine learning

9 Jul 2020

Andrew Halberstadt 12 min read

A browser is an enormously complex piece of software, and it's always in development. About a year ago, we asked ourselves: how could we do better? Our CI relied heavily on human intervention. What if we could instead correlate patches to tests using historical regression data? Could we use a machine learning algorithm to figure out the optimal set of…

artificial intelligence, featured article, firefox development highlights, ci, machine learning

26 Jun 2016

Dominic Steinitz 13 min read

Introduction In the 1920s, Lotka (1909) and Volterra (1926) developed a model of a very simple predator-prey ecosystem. Although simple, it turns out that the Canadian lynx and snowshoe hare are well represented by such a model. Furthermore, the Hudson's Bay Company kept records of how many pelts of each species were trapped for almost … Continue reading Ecology, Dynamical…

bayesian, haskell, machine learning, numerical methods, probability

1 May 2016

Dominic Steinitz 5 min read

Introduction This is a bit different from my usual posts (well apart from my write up of hacking at Odessa) in that it is a log of how I managed to get LibBi (Library for Bayesian Inference) to run on my MacBook and then not totally satisfactorily (as you will see if you read on). … Continue reading Fun with…

bayesian, machine learning, statistics

6 Dec 2015

Dominic Steinitz 15 min read

Introduction Let be a (hidden) Markov process. By hidden, we mean that we are not able to observe it. And let be an observable Markov process such that That is the observations are conditionally independent given the state of the hidden process. As an example let us take the one given in Särkkä (2013) where … Continue reading Naive Particle…

bayesian, haskell, machine learning, probability, statistics

7 Jun 2015

Dominic Steinitz 1 min read

Here’s the same analysis of estimating population growth using Stan. data { int<lower=0> N; // number of observations vector[N] y; // observed population } parameters { real r; } model { real k; real p0; real deltaT; real sigma; real mu0; real sigma0; vector[N] p; k <- 1.0; p0 <- 0.1; deltaT <- 0.0005; sigma … Continue reading Population Growth…

bayesian, machine learning, statistics

31 May 2015

9 Sept 2014

Dominic Steinitz 7 min read

Summary An extended Kalman filter in Haskell using type level literals and automatic differentiation to provide some guarantees of correctness. Population Growth Suppose we wish to model population growth of bees via the logistic equation We assume the growth rate is unknown and drawn from a normal distribution but the carrying capacity is known and … Continue reading Fun with…

bayesian, haskell, machine learning, probability, statistics

13 Oct 2013

Dominic Steinitz 12 min read

Preface The intended audience of this article is someone who knows something about Machine Learning and Artificial Neural Networks (ANNs) in particular and who recalls that fitting an ANN required a technique called backpropagation. The goal of this post is to refresh the reader’s knowledge of ANNs and backpropagation and to show that the latter … Continue reading Backpropogation is…

haskell, machine learning, numerical methods, statistics

31 May 2013

Dominic Steinitz 18 min read

Introduction Neural networks are a method for classifying data based on a theory of how biological systems operate. They can also be viewed as a generalization of logistic regression. A method for determining the coefficients of a given model, backpropagation, was developed in the 1970’s and rediscovered in the 1980’s. The article “A Functional Approach … Continue reading Neural Networks…

haskell, machine learning

30 Apr 2013

Dominic Steinitz 7 min read

Introduction Having shown how to use automated differentiation to estimate parameters in the case of linear regression let us now turn our attention to the problem of classification. For example, we might have some data about people’s social networking such as volume of twitter interactions and number of twitter followers together with a label which … Continue reading Logistic Regression…

haskell, machine learning, probability, statistics

26 Apr 2013

Dominic Steinitz 6 min read

Introduction Automated differentiation was developed in the 1960’s but even now does not seem to be that widely used. Even experienced and knowledgeable practitioners often assume it is either a finite difference method or symbolic computation when it is neither. This article gives a very simple application of it in a machine learning / statistics … Continue reading Regression and…

haskell, machine learning, probability, statistics, uncategorized