Talks


Talks I have given on various occasions

Text Embedding and Representation - Classical & Modern Methods @ Harmonya & DataCoach, 2023-2024 - A deep overview of how the issue of text representation in machine learning has been tackled throughout the history of the field. These are unified slides for a series of 4 lectures I gave on the topic, encompassing both classical methods such as word2vec and modern SOTA methods like Sentence Transformers and LLM-based embeddings, and covering every notable development in between. [Google Slides]

A short intro to Data Vis @ Business Aspects of Digital Experiments, 05.03.2024 - A guest lecture at the Business Aspects of Digital Experiments B.A. course at Tel Aviv University Coller School of Business Management. [Google Slides]

Text Embedding and Representation - Part 2 @ DataCoach, 18.12.2023 - A deep overview of how the issue of text representation in machine learning has been tackled throughout the history of the field. Part 2 continued the first part by covering various early attempts to use deep learning to solve the problem of learning good text representations, concluding with Sentence Transformers and LLM-based embeddings. [Google Slides]

Text Embedding and Representation - Part 1 @ DataCoach, 11.12.2023 - A deep overview of how the issue of text representation in machine learning has been tackled throughout the history of the field. Part 1 covered everything from bag-of-words to document-level variations on Word2Vec, up to the point where deep learning techniques became prominent in learning text embeddings. [Google Slides]

What is MLOps? @ Harmonya, 6.09.2023 - A concise review of the types and categories of MLOps tools that are out there. [Google Slides]

Frameworks for data science projects: Workflows, def docs, playbooks & peer reviews @ DataNights DS Management #2, 21.06.2023 - A few frameworks to handle data science project challenges. Given at the second cohort of our DataNights program on leading data science teams. [Google Slides]

Interview Formats for Data Scientists @ DataNights DS Management #2, 31.05.2023 - Covers different templates for technical data science interviews, with an emphasis on their suitability to interviewing data scientists of different levels of seniority. Given at the second cohort of our DataNights program on leading data science teams. [Google Slides]

Deep Learning Approaches for Time Series Forecasting @ DataTalks #38, 29.3.2023 - An introductory talk covering neural network approaches to time series forecasting, including practical aspects and state-of-the-art techniques. [Google Slides] [Event page]

Text Embedding & Representation @ Harmonya, 25.01.2023 - An overview of classic methods for word and document embedding in NLP, and the basic concepts of text representation in machine learning. [Google Slides]

About the DataCoach Technion Program @ DataCoach Expose Event, The Technion, 18.01.2023 - A presentation of the DataCoach Technion program by the DataHack nonprofit. [Google Slides]

The Hard Case of Why: A Causal Inference Case Study of Data Science Projects in Startups @ Faculty Seminar, Industrial Engineering & Management, Azrieli College AI Product course, 22.11.2022 - A case study for data science projects as they are done in small startups, meant to provide students with insights about work in the data science industry. [Google Slides]

Intro to Data Science in Startups @ pmsphere’s AI Product course, 26.10.2022 - A technical and methodological introduction to data science work as it is done in startups, aimed at product management personnel being trained to work on and lead the product side of data science projects. Given as part of the first cohort of the AI prod.: Secrets for PM (PM602) course by the awesome women of pmsphere. [Google Slides]

Data Science Peer Review @ DataNights DS Management, 14.09.2022 - A layout for a peer review process for the different phases of data science projects. Given at the first cohort of my DataNights program on leading data science teams. [Google Slides]

Interview Formats for Data Scientists @ Data Science Leads Israel, 13.09.2022 - Covers different templates for technical data science interviews, with an emphasis on their suitability to interviewing data scientists of different levels of seniority. Given at the monthly meeting of the Data Science Leads in Israel forum. [Google Slides]

Interview Formats for Data Scientists @ DataNights DS Management, 24.08.2022 - Covers different templates for technical data science interviews, with an emphasis on their suitability to interviewing data scientists of different levels of seniority. Given at the first cohort of my DataNights program on leading data science teams. [Google Slides]

KPI-Objective Alignment in Data Science Projects @ DataNights DS Management, 17.08.2022 - A discussion of the ways we align and misalign technical and mathematical objectives of data science techniques to business and product KPIs. Given at the first cohort of my DataNights program on leading data science teams. [Google Slides]

Data Science Projects: Workflows and Playbooks @ DataNights DS Management, 10.08.2022 - A few frameworks to handle data science project challenges. Given at the first cohort of my DataNights program on leading data science teams. [Google Slides]

DeBERTa: An Introduction @ Kaggle IL Meetup #11, 26.07.2022 - A quick overview of the novelties the DeBERTa model introduced to the transformer architecture. [Google Slides]

Feedback Prize - Predicting Effective Arguments: The Kickoff @ Kaggle IL Meetup #11, 26.07.2022 - Kickoff presentation for the Feedback Prize Kaggle competition, while it was running. [Google Slides]

Deep Learning Approaches for Time Series Forecasting @ Y-DATA, 3.6.2022 - A guest lecture at the Y-DATA program, covering neural network approaches to time series forecasting, including practical aspects and state-of-the-art techniques. [Google Slides]

Data Science Peer Review @ Voyantis, 25.4.2022 - A layout for a peer review process for the different phases of data science projects. Given as part of the data science tribe meeting at Voyantis. [Google Slides]

Introduction to PU Learning @ DiSCo, a data science community at Ben-Gurion University of the Negev, 10.4.2022 - An introduction to the semi-supervised classification scenario of Positive-Unlabeled learning. Covers theoretical definitions, performance metrics for the scenario and solution approaches. [Google Slides]

Data Science in the Wild: How it’s done (in startups) @ NAYA College, 7.4.2022 - A review of what data science work actually looks like in startups, and some useful tools and methodologies to succeed in it. [Google Slides]

KPI-Objective Alignment in Data Science Projects @ AI & Data Virtual Summit, 5.1.2021 - A discussion of the ways we align and misalign technical and mathematical objectives of data science techniques to business and product KPIs. [Google Slides]

Introduction to Advanced Concepts in Neural Networks @ Forter, 18.11.2020 - A lecture meant to follow a general introduction to feed-forward neural networks, and thus delves into more advanced concepts such as convolutional NNs, sequence models and graph NNs. [Google Slides]

Data Cleaning for Machine Learning @ DataNights #4, 2.8.2020 - A review of common data cleaning principles, techniques and methods. Given as part of the fourth cohort of DataHack’s DataNights program. [Google Slides]

Introduction to PU Learning @ BigPanda, 11.5.2020 - An introduction to the semi-supervised classification scenario of Positive-Unlabeled learning. Covers theoretical definitions, performance metrics for the scenario and solution approaches. [Google Slides]

Introduction to Word2Vec @ DataNights #3, 10.5.2020 - An introduction to word embeddings in general, and the Word2Vec technique in particular. Given as part of the third cohort of DataHack’s DataNights program. [Video] [Google Slides] [Jupyter Notebooks and a home exercise]

Data Science Peer Review @ Outbrain, 23.4.2020 - A layout for a peer review process for the different phases of data science projects. Given as part of a data science guild day at Outbrain. [Google Slides]

Python Testing with pytest @ BigPanda, April 2020 - An introduction to testing Python code with pytest. [Google Slides]

Detecting Stationarity in Time Series Data @ BigPanda, 2.4.2020 - A short review of the different ways to detect stationarity in time series data. Based on my blog post of the same name. [Google Slides]

Data Science Peer Review @ BigPanda, 8.3.2020 - A layout for a peer review process for the different phases of data science projects. [Google Slides]

Introduction to Word2Vec @ DataNights #2, 5.2.2020 - An introduction to word embeddings in general, and the Word2Vec technique in particular. Given as part of the second cohort of DataHack’s DataNights program. [Google Slides] [Jupyter Notebooks and a home exercise]

Causal Inference in Time Series Data @ PyData #28, 2.01.20 - A quick overview of causal inference in time series data, based on my literature review-y blog post on the same topic, given at PyData Tel Aviv #28. [Event] [Google Slides] [Video]

How to run a Data Science project: An Overview for Managers @ SWC Consulting, 28.11.19 - An introduction to data science in action, written as an overview for managers, business people and product managers aiming to manage or work with data scientists. [Google Slides]

Unsupervised Document Embedding Techniques: A Status Report @ CodeteCON #KRK5, Krakow, Poland, 15.11.19 - A survey of modern techniques for unsupervised (and some supervised) document embedding, including takeaways and a status report from the industry. [Google Slides]

ASHRAE Great Energy Predictor III: The Kickoff @ Kaggle IL Meetup #6, 29.10.19 - Kickoff presentation for the ASHRAE Great Energy Predictor III Kaggle competition, while it was running. [Google Slides]

Stationarity in Time Series Analysis, Part 2: Parametric notions of non-stationarity @ BigPanda, September 2019 - This talk discusses stochastic process modelling and parametric notions of stationarity. Based on the second part of my blog post on the same topic. [Google Slides]

Stationarity in Time Series Analysis: Why it’s important and how it’s defined @ BigPanda, September 2019 - A thorough introduction to the concept of stationarity in time series analysis, including a brief introduction to stochastic processes. Does not include parametric notions of stationarity. Based on the first part of my blog post on the same topic. [Google Slides]

Document Embeddings: A concise literature review @ BigPanda, September 2019 - A concise overview of document embedding techniques. Assumes familiarity with word embedding techniques. Based on my literature review on the same topic. [Google Slides]

Word Embeddings: A rough introduction @ BigPanda, September 2019 - A very quick and rough introduction to word embeddings, focusing on the skip-gram model of word2vec. [Google Slides] [Video]

DataLearn 2019: Introduction to Machine Learning @ DataLearn Prep Night 2019, 25.8.19 - A hands-on introductory workshop to machine learning, focusing on supervised classification, for the participants of the DataLearn track at DataHack 2019. [Video] [Github repository] [Google Slides]

Predicting Molecular Properties: Updates @ Kaggle IL Meetup #3, 22.7.19 - Some updates from the “Predicting Molecular Properties” Kaggle competition, while it was running. [Google Slides]

Clustering Evaluation @ Check Point data science guild day, 4.7.19 - An introductory talk about how to evaluate clustering methods. [Google Slides]

Clustering: A very rough intro @ BigPanda, April 2019 - An introductory talk about clustering I gave to data team members at BigPanda as part of periodic knowledge share sessions. [Google Slides]

Data Preparation: Scaling and Normalization (A very rough intro) @ BigPanda, March 2019 - An introductory talk about scaling and normalization I gave to data team members at BigPanda as part of periodic knowledge share sessions. [Google Slides]

Python Packaging Workshop @ Intuit, 13.12.2019 - A hands-on workshop for Python programmers, walking them through the process of writing, testing and publishing their own Python packages. [Resources page: Presentation slides, repository links and complementary blog post]

Packaging for personal open source projects @ PyData Tel Aviv #19 - Open-Source Sprint, 19.12.18 - I talked about why packaging can be important even for small and personal open source projects, focusing on Python projects, at the 19th PyData TLV meetup. [Google Slides] [PDF slides]

Bot use in election period and its effect on freedom of speech @ Conference on freedom of speech on the internet in election period, 6.12.18 - I was part of a panel discussing the topic of bot use in election period and its effect on freedom of speech, providing the point of view of the AI research community and industry. [Video]

Practical preprocessing for Machine Learning @ DataLearn 2018, 3.10.18 - An overview of the most common and important preprocessing techniques used for data preparation in Machine Learning projects, aimed at newcomers. Given as part of DataLearn 2018, the workshop track of DataHack 2018. [Video] [Google Slides] [PDF slides]

Pied PyPIer: Why packaging is important for both open and closed data science projects @ PyCon Israel 2018, 5.6.18 - A talk aimed at sharing my experience developing small Python packages in order to encourage more data scientists to open and share their Python code in package form. [Video] [Google Slides] [PDF slides]

Data Science in the wild @ TIP 2018 - I talked at the Trans-disciplinary Innovation Program of The Hebrew University of Jerusalem about the actualities of practicing data science in the wild, and lessons learned from the mistakes I made. [Google Slides]

Computing Nash equilibria in graphical games @ AlgoIL #1, 4.12.17 - A talk I gave at the first Algorithms Israel meetup, reviewing three different algorithms for finding Nash equilibria in graphical games. [Video] [Google Slides] [PDF slides]

Fuzzy Credit Networks @ Reversim 2017 - A talk I gave at Reversim 2017 about a variant of the concept of credit networks that I came up with. [Video] [Abstract] [Google Slides] [PDF slides]

ODsL @ Reversim 2017 - I talked at Reversim 2017 about the ODsL project. [Video] [Abstract] [Google Slides] [PDF slides]

Handling data with Python: A hands-on workshop - A hands-on introductory workshop on how to handle data with Python using Jupyter Notebook, Numpy and Pandas. Given at DataHack 2016 and DataHack 2017. [Repository]

Quick & dirty data science with Python @ DataTalks #3, 15.3.17 - I presented a classification challenge we had at Neura, and how we tackled it using the simplest machine learning tools and some dirty heuristics to get a working system with good results in a short amount of time. [Event page]

Introduction to Network Analysis (with Python) @ Data Science Summit Europe 2016 Training Day, 5.6.16 - Together with Inbar Naor. An introduction to exploring network-structured datasets with the Python networkx package and Jupyter Notebook. [PDF slides] [Repository]

Fireside chat on Big Data @ Ir-Academia, Industry-Academia Event #2, 24.5.16 - Co-organized and co-hosted the Big Data Panel with Inbar Naor, interviewing Prof. Yossi Matias, VP Engineering @ Google, Prof. Menahem Ben-Sasson, HUJI President, and Prof. Yair Weiss, Head of the CS school. [Agenda]

Hackathons as an information sharing tool for companies and governmental organizations @ Teldan INFO 2016, 23-25.5.16 - Together with Inbar Naor. A presentation on the importance of hackathons as a community outreach avenue for various organizations. [PDF slides]