How Airbnb used sequential geographic recovery signals and prior propagation to generate reliable corridor-level forecasts when local data was scarce. By: Harrison Katz The problem with unprecedented shocks Almost every forecasting system is built on the same implicit assumption: the future will resemble the past. You train on historical data, you validate on holdout periods, and you trust that past…
#data science
20 posts
2 Jun
18 May
TL;DR LLM evals, automated judges that assess relevance, coherence, and quality at scale, are a powerful new... The post Better Experiments with LLM Evals — A funnel, not a fork appeared first on Spotify Engineering.
29 Apr
Evaluating AI at Scale: How Thumbtack Approaches Reliability, Safety, and Quality in GenAI
Thumbtack
A practical look at how Thumbtack navigates evaluation for emerging AI experiences and what we’ve learned along the way. By: Shishir Dash, Director of Applied Science & Teja Venkat Kolli, Senior Applied Scientist Evaluating AI at Scale Introduction AI is reshaping how people interact with products, and Thumbtack is no exception. We’re introducing AI into more aspects of…
24 Mar
Expedia Group Technology — Data Workload‑aware routing for Trino Photo by Joseph Barrientos on Unsplash Trino — a fork of PrestoSQL — is a powerful tool in modern data analytics, enabling organizations to query large datasets quickly and efficiently. As a distributed SQL query engine, Trino provides fast, scalable insights without requiring data relocation. While Trino is robust on its…
17 Feb
Expedia Group Technology — Data Quickly identifying winning ranking models before committing to A/B tests Authors: Adam Woznica, Benjamin Stieger, and Stefania Ebli Photo by Il Vagabiondo on Unsplash Expedia Group ™ covers a portfolio of brands such as Expedia.com, Hotels.com, and Vrbo, that power lodging searches for millions of travel shoppers every day. In this competitive market matching users…
27 Jan
Expedia Group Technology — Data Two roles one goal — understanding users better By Sophie Rabet and Alyssa White Photo by Samsung Memory US on Unsplash Quantitative User Experience (UX) Research, as a discipline, is growing rapidly. Quant UX Con 2022, the first ever general industry conference for the discipline, was organized with the expectation of about 200 attendees. After…
6 Jan
Expedia Group Technology — Data Science Empowering developers with seamless vector embedding solutions Photo by Daniela Cuevas on Unsplash Introduction Rapid advances in Machine Learning (ML), especially Generative AI, have increased the need for specialized capabilities like vector embedding similarity search. Vector embeddings are the numerical representations created by machine learning models which allow disparate inputs to be compared against…
10 Oct 2025
As a fast-growing home services platform, we heavily utilize machine learning to elevate user experience and improve business processes such as reducing spam, improving search results, and providing recommendations. In recent years, Generative AI has taken the world by storm as a powerful addition to traditional ML. We embraced this mega trend by incorporating LLMs into various areas of our…
19 Aug 2025
An introduction to the new Results Table integrated into the output cell of Notebooks, powered by the VS Code extension called Data Wrangler. The post Announcing the Data Wrangler powered Notebook Results Table appeared first on Microsoft for Python Developers Blog.
17 Mar 2025
Data scientists use different Jupyter notebooks every day — ranging from disposable ones for quick tasks to those shareable with clients. Over time, more and more notebooks accumulate, making it increasingly difficult to reuse them in whole or in part. To mitigate this problem and make the most relevant pieces of code quickly accessible to every data scientist, we developed…
2 Dec 2024
Experience cloud computing with Python on Azure during Python Day 2024! The post Announcing: Azure Developers – Python Day appeared first on Microsoft for Python Developers Blog.
4 Nov 2024
AI did not write this blog post, but it will make your exploratory data analysis with Data Wrangler better! Today, we’re excited to introduce our first step of integrating the power of Copilot into Data Wrangler. With this first integration of Copilot with Data Wrangler, you’ll be able to: Use natural language to clean and […] The post Announcing GitHub…
7 May 2024
Announcing Data Wrangler: Code-centric viewing and cleaning of tabular data in Visual Studio Code
Microsoft Python Engineering
Today, we are excited to announce the general availability of the Data Wrangler extension for Visual Studio Code! Data Wrangler is a free extension that offers data viewing and cleaning that is directly integrated into VS Code and the Jupyter extension. It provides a rich user interface to view and analyze your data, show insightful […] The post Announcing Data…
28 Nov 2023
A clustering-based approach to create deep learning datasets in a day Introduction Understanding what’s happening in an image is both an important task and a costly one. In the last few years, the field of computer vision has greatly accelerated due to the advances in neural networks. At Bumble Inc., we see potential value in computer vision for…
17 Mar 2023
Microsoft announces the launch of Data Wrangler, a data-centric user interface that generates Python code to help data scientists complete their data preparation tasks faster and with fewer errors. The post Introducing the Data Wrangler extension for Visual Studio Code appeared first on Microsoft for Python Developers Blog.
21 Sept 2022
How I learned to manipulate JSON data with Pandas on a Jupyter Notebook and deconstruct it to a DataFrame ready for queries. Image by author created from Jupiter photo by NASA and Pandas photo by Pascal Müller on Unsplash A bit of context first I started a self-study path to learn the theoretical fundamentals of Data Science and Machine Learning.…
21 Apr 2022
25 Oct 2021
Our friends at Anaconda posted a joint announcement last week regarding the use of their repository from Microsoft cloud-hosted products. See the full announcement on their website. Today, Anaconda, Inc. announced a collaboration with Microsoft to enable customers to confidently access Anaconda’s curated library of open-source packages within Microsoft Cloud-hosted products and services, including […] The post Anaconda licensing…
8 Jul 2020
Enhance your Azure Machine Learning experience with the VS Code extension
Microsoft Python Engineering
The VS Code team is excited to announce releases of the Azure Machine Learning extension which aims to help you manage your core machine learning assets from directly within your favourite editor! The post Enhance your Azure Machine Learning experience with the VS Code extension appeared first on Microsoft for Python Developers Blog.
16 Apr 2019
Pyodide is an experimental project from Mozilla to create a full Python data science stack that runs entirely in the browser. We think it’s worthwhile to work on moving the JavaScript data science ecosystem forward, and that's why we built and released Iodide earlier this year. In the meantime, we’re meeting data scientists where they are by bringing the popular…