How Airbnb used sequential geographic recovery signals and prior propagation to generate reliable corridor-level forecasts when local data was scarce. By: Harrison Katz The problem with unprecedented shocks Almost every forecasting system is built on the same implicit assumption: the future will resemble the past. You train on historical data, you validate on holdout periods, and you trust that past…
#machine learning
52 posts
2 Jun
Author: Paul Coursaux At Criteo, retail media is about helping brands reach shoppers directly on retailers’ property, right at the digital shelf where purchase decisions are made. Through CMAX, our unified retail media platform, we connect advertisers to retailers’ audiences with Sponsored Products that appear alongside native results in onsite search and browsing experiences. Sponsored products: Boost your brand…
21 May
Authors (listed alphabetically) Ads Feature Engineering Infra team: Ajay Venkatakrishnan, Le Zhang Core ML Infra team: Eric Shang, Pihui Wei ML Data team: Connor Votroubek, Yi He User Understanding team: Camilo Munoz, Simin Li If you work on ranking, retrieval, or recommendation systems, you’ve probably asked for some version of the same thing: “Give me the last N…
Nova lets engineers run multiple coding sessions in parallel and lets internal systems use AI agents as part of automated workflows.
11 May
One might think computer vision models are supposed to be easy to put into production. There are whole companies built on that promise: label a few images, click train, click deploy, done. In practice, it’s messier. Most of us working with these models aren’t ML experts, and moving fast to keep up with the industry […] The post Lessons from…
4 May
Democratizing Machine Learning at Netflix: Building the Model Lifecycle Graph
Netflix Technology Blog. Saish Sali, Nipun Kumar, Sura Elamurugu Introduction As Netflix has grown, machine learning continues to support our ability to deliver value to members and drive excellence across multiple areas of our business. When Netflix began investing in machine learning over a decade ago, it was primarily focused on a single domain: personalization. Scala was the industry standard, our…
1 May
By Nipun Kumar, Rajat Shah, Peter Chng Introduction This is the first blog post in a multi-part series that shares technical insights into how our ML model serving infrastructure powers several personalized experiences at scale across various domains (e.g., title recommendations, commerce). In this introductory blog post, we will dive into our domain-independent API abstraction and its traffic…
Guangtong Bai | Staff Software Engineer, Product ML Infrastructure*; Shantam Shorewala | Software Engineer II, Product ML Infrastructure*; Chi Zhang | Staff Software Engineer, AI Platform*; Neha Upadhyay | Software Engineer II, AI Platform*; Haoyang Li | Director, Product ML Infrastructure *These authors contributed equally to this article. Background At Pinterest, our online ML serving systems employ a root-leaf architecture.…
27 Apr
From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest
Pinterest. Authors: Richard Huang | Machine Learning Engineer II; Yu Liu | Senior Machine Learning Engineer; Ziwei Guo | Senior Machine Learning Engineer; Andy Mao | Staff Machine Learning Engineer; Supeng Ge | Sr. Staff Machine Learning Engineer Introduction At Pinterest, conversion ads are crucial for matching users with products they are likely to purchase, boosting value for both users and…
15 Apr
Vaibhav Shankar; Staff Software Engineer | Raymond Lee; Staff Software Engineer | Chia-Wei Chen; Staff Software Engineer | Shunyao Li; Sr. Software Engineer | Yi Li; Staff Software Engineer | Ambud Sharma; Principal Engineer | Saurabh Vishwas Joshi; Principal Engineer | Charles-A. Francisco; Senior Engineer | Karthik Anantha Padmanabhan; Director, Engineering | David Westbrook; Sr. Manager, Engineering One day in…
13 Apr
Authors: Matt Lawhon | Sr. Machine Learning Engineer; Filip Ryzner | Machine Learning Engineer II; Kousik Rajesh | Machine Learning Engineer II; Chen Yang | Sr. Staff Machine Learning Engineer; Saurabh Vishwas Joshi | Principal Engineer At Pinterest, scaling our recommendation models delivers outsized impact on the quality of the content we serve to users. Our Foundation Model (oral spotlight,…
26 Feb
How we train Dash's search ranking models with a mix of human and LLM-assisted labeling.
12 Feb
Making products like Dropbox Dash accessible to individuals and businesses means tackling new challenges around efficiency and resource use.
6 Jan
Expedia Group Technology, Data Science. Empowering developers with seamless vector embedding solutions Introduction Rapid advances in Machine Learning (ML), especially Generative AI, have increased the need for specialized capabilities like vector embedding similarity search. Vector embeddings are the numerical representations created by machine learning models which allow disparate inputs to be compared against…
18 Dec 2025
The feature store is a critical part of how we rank and retrieve the right context across your work.
26 Nov 2025
Introduction In an age where artificial intelligence (AI) and machine learning (ML) are integral to almost every aspect of our lives, ensuring the effectiveness, fairness, and reliability of ML models is paramount. Observability plays a crucial role in maintaining the performance of these models, allowing us to detect and resolve issues promptly. At Helpshift, we recognized the need for robust…
25 Nov 2025
By Jean V. Alves and Ferran Pla Fernández Moving beyond binary classification provides novel insights. In the real world, scams rarely present themselves in black and white. Fraudsters exploit nuance, impersonate legitimate brands, and mask malicious intent with seemingly ordinary behavior. That’s why Feedzai has launched ScamAlert (patent pending), a Generative AI-based system innovating on the current paradigm of scam…
10 Oct 2025
As a fast-growing home services platform, we heavily utilize machine learning to elevate user experience and improve business processes such as reducing spam, improving search results, and providing recommendations. In recent years, Generative AI has taken the world by storm as a powerful addition to traditional ML. We embraced this mega trend by incorporating LLMs into various areas of our…
26 Aug 2025
How we made our email story recommendations better In this Part 1, you’ll understand how we improved one of the main ways our users are exposed to our product and how that led to a massive 7% increase in the average reading time of digest users. Intro: This is a 4-part series breaking down improvements to the algorithm…
25 Aug 2025
Cross-Digest diversification In this Part 4, we’ll see how we went from investigating a few complaints from digest power users to improving our digest recommendations across the board. Intro: This is a 4-part series breaking down improvements to the algorithm behind Medium’s Daily Digest over the past year. When we started this work, the Digest was suboptimal…
Hard vs Soft Filtering and how this applies to Medium’s Recommendation System In this Part 3, we’ll see how we modified one of our hard filtering rules and attempted to turn it into a machine-learning-based “soft filter”. Intro: This is a 4-part series breaking down improvements to the algorithm behind Medium’s Daily Digest over the past…
25 Jul 2025
By Sofia Guerreiro, Ricardo Ribeiro Pereira, Iker Perez, Jacopo Bono Detecting financial fraud is like finding a moving needle in a shifting haystack. Fraud accounts for a tiny fraction of financial transactions, often less than 0.1%. At the same time, fraudsters are constantly adapting their tactics to evade detection. And this happens within a live and dynamic environment, where…
24 Feb 2025
How we used generative AI to build our year-in-review campaign
28 Jan 2025
Qualitative comparison of image embedding models to power a scalable similar-image replacement system for Canva designs.
16 Dec 2024
Overview The past few months have been exciting times for Slack’s CI infrastructure. After years of developer frustration with Jenkins (everything from security issues to downtime to generally poor UX) internal pressure led us to move a majority of Slack’s CI jobs from Jenkins to GitHub Actions. My intern project at Slack this summer involved…
25 Nov 2024
How we improved Canva’s private design search while respecting the privacy of our community.
8 Nov 2024
Background and motivation In the fast-paced world of software development, having the right tools can make all the difference. At Slack, we’ve been working on a set of AI-powered developer tools that are saving 10,000+ hours of developer time yearly, while meeting our strictest requirements for security, data protection, and compliance. In this post, we’ll…
31 Oct 2024
Here's how machine learning drives business efficiency, from customer insights to fraud detection, powering smarter, faster decisions. The post Machine Learning for business: what are the advantages? appeared first on Erlang Solutions.
12 Aug 2024
By Sérgio Jesus, Inês Silva, Pedro Saleiro, Hugo Ferreira, Pedro Bizarro In this blog post we will visit Aequitas Flow, an open-source framework designed to run complete and standardized experiments of Fair ML algorithms. We encourage you to try Aequitas Flow with the Google Colab Notebooks, which are available in the project’s GitHub repository. This blog post is…
21 Jun 2024
In the world of financial services, the bank or financial institution’s relationship with the customer relies on digital trust, which is anchored in two fundamental principles. First, it must ensure the person engaging through digital banking channels is genuinely the individual they claim to be. Second, it must confirm that this person is authorized to complete the intended financial…
18 Apr 2024
At Slack, we’ve long been conservative technologists. In other words, when we invest in leveraging a new category of infrastructure, we do it rigorously. We’ve done this since we debuted machine learning-powered features in 2016, and we’ve developed a robust process and skilled team in the space. Despite that, over the past year we’ve been…
21 Feb 2024
Leveraging Spark 3 and NVIDIA’s GPUs to Reduce Cloud Cost by up to 70% for Big Data Pipelines
PayPal. By Ilay Chen and Tomer Akirav At PayPal, hundreds of thousands of Apache Spark jobs run on an hourly basis, processing petabytes of data and requiring a high volume of resources. To handle the growth of machine learning solutions, PayPal requires scalable environments, cost awareness and constant innovation. This blog explains how Apache Spark 3 and GPUs can help enterprises…
11 Dec 2023
PayPal supports over 400 million active consumers and merchants worldwide. Every minute there are several thousand payment transactions. To prevent fraud in real-time at such a scale, we need to streamline our ML workflow and feature engineering processes to build strong predictors of behaviors and risk indicators. On top of that, it must be done…
28 Nov 2023
A clustering-based approach to create deep learning datasets in a day Introduction Understanding what’s happening in an image is both an important task, as well as a costly one. In the last few years, the field of computer vision has greatly accelerated due to the advances in neural networks. At Bumble Inc., we see potential value in computer vision for…
31 Aug 2023
The effective use of AI is becoming the next great differentiator for business, but many SMEs are confused about what to adopt and how to adopt it. The post What businesses should consider when adopting AI and machine learning appeared first on Erlang Solutions.
25 Apr 2023
Why Every Developer Should Learn ChatGPT and SudoLang I recently started using an AI Driven Development (AIDD) process that has many benefits: Increased development productivity 10x to 20x, allowing us to take on more projects, and more ambitious challenges that would previously have been too resource-intensive to tackle. Opened up our applications to magical features we could not have…
3 Apr 2023
Running Riteway’s usage example tests in SudoLang running on ChatGPT using GPT-4 I have been a long-time advocate of Test-Driven Development (TDD) because of its many productivity and quality benefits. You can read more about those in “TDD Changed My Life”. When I realized that GPT-4 was capable of following complex instructions, one of the first things I thought…
6 Sept 2022
Slack, as a product, presents many opportunities for recommendation, where we can make suggestions to simplify the user experience and make it more delightful. Each one seems like a terrific use case for machine learning, but it isn’t realistic for us to create a bespoke solution for each. Instead, we developed a unified framework we…
20 Jul 2022
Co-authored by Oumaima BENBAHAKKA, Dr. Paul Farrow, Dr. Romain Quéré and Dr. Yana Volkovich At Xandr, we are actively participating in ongoing industry discussions about the future of identity, as well as carefully evaluating various emerging proposals. One of those proposals is the Topics API. Deconstructing…
24 Mar 2022
Using Machine Learning to Understand How Branding in Photos Affects the Car Shopping Experience
TrueCar. By: Samad Patel This blog post delves into how we answered a challenging business question using pre-trained AWS Models. Our question required us to parse text from photos, then analyze the contents of that text. We used AWS Rekognition and Comprehend to extract and classify text from photos, followed by a few highly interpretable statistical methods to analyze the data.…
25 Oct 2021
Our friends at Anaconda posted a joint announcement last week regarding the use of their repository from Microsoft cloud-hosted products. See the full announcement on their website. Today, Anaconda, Inc. announced a collaboration with Microsoft to enable customers to confidently access Anaconda’s curated library of open-source packages within Microsoft Cloud-hosted products and services, including […] The post Anaconda licensing…
9 Jul 2020
A browser is an enormously complex piece of software, and it's always in development. About a year ago, we asked ourselves: how could we do better? Our CI relied heavily on human intervention. What if we could instead correlate patches to tests using historical regression data? Could we use a machine learning algorithm to figure out the optimal set of…
26 Jun 2016
Introduction In the 1920s, Lotka (1909) and Volterra (1926) developed a model of a very simple predator-prey ecosystem. Although simple, it turns out that the Canadian lynx and snowshoe hare are well represented by such a model. Furthermore, the Hudson’s Bay Company kept records of how many pelts of each species were trapped for almost … Continue reading Ecology, Dynamical…
1 May 2016
Introduction This is a bit different from my usual posts (well apart from my write up of hacking at Odessa) in that it is a log of how I managed to get LibBi (Library for Bayesian Inference) to run on my MacBook and then not totally satisfactorily (as you will see if you read on). … Continue reading Fun with…
6 Dec 2015
Introduction Let $\{X_t\}$ be a (hidden) Markov process. By hidden, we mean that we are not able to observe it. And let $\{Y_t\}$ be an observable Markov process such that $p(y_t \mid x_{1:t}, y_{1:t-1}) = p(y_t \mid x_t)$. That is, the observations are conditionally independent given the state of the hidden process. As an example let us take the one given in Särkkä (2013) where … Continue reading Naive Particle…
7 Jun 2015
Here’s the same analysis of estimating population growth using Stan. data { int<lower=0> N; // number of observations vector[N] y; // observed population } parameters { real r; } model { real k; real p0; real deltaT; real sigma; real mu0; real sigma0; vector[N] p; k <- 1.0; p0 <- 0.1; deltaT <- 0.0005; sigma … Continue reading Population Growth…
31 May 2015
Thames Flux It is roughly 150 miles from the source of the Thames to Kingston Bridge. If we assume that it flows at about 2 miles per hour then the water at Thames Head will have reached Kingston very roughly at $\frac{150}{2 \times 24} \approx 3$ days. The Environment Agency measure the flux at Kingston Bridge on a twice daily … Continue reading The Flow…
9 Sept 2014
Summary An extended Kalman filter in Haskell using type level literals and automatic differentiation to provide some guarantees of correctness. Population Growth Suppose we wish to model population growth of bees via the logistic equation $\dot{p} = rp\left(1 - \frac{p}{k}\right)$. We assume the growth rate is unknown and drawn from a normal distribution but the carrying capacity is known and … Continue reading Fun with…
13 Oct 2013
Preface The intended audience of this article is someone who knows something about Machine Learning and Artificial Neural Networks (ANNs) in particular and who recalls that fitting an ANN required a technique called backpropagation. The goal of this post is to refresh the reader’s knowledge of ANNs and backpropagation and to show that the latter … Continue reading Backpropogation is…
31 May 2013
Introduction Neural networks are a method for classifying data based on a theory of how biological systems operate. They can also be viewed as a generalization of logistic regression. A method for determining the coefficients of a given model, backpropagation, was developed in the 1970’s and rediscovered in the 1980’s. The article “A Functional Approach … Continue reading Neural Networks…
30 Apr 2013
Introduction Having shown how to use automated differentiation to estimate parameters in the case of linear regression let us now turn our attention to the problem of classification. For example, we might have some data about people’s social networking such as volume of twitter interactions and number of twitter followers together with a label which … Continue reading Logistic Regression…
26 Apr 2013
Introduction Automated differentiation was developed in the 1960’s but even now does not seem to be that widely used. Even experienced and knowledgeable practitioners often assume it is either a finite difference method or symbolic computation when it is neither. This article gives a very simple application of it in a machine learning / statistics … Continue reading Regression and…