20 April 2006

Aleksander Molak - Collected Quotes

"An important concept in complexity science is emergence – a phenomenon in which we can observe certain properties at the system level that cannot be observed at its constituent parts’ level. This property is sometimes described as a system being more than the sum of its parts." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"Any time you run regression analysis on arbitrary real-world observational data, there’s a significant risk that there’s hidden confounding in your dataset and so causal conclusions from such analysis are likely to be (causally) biased." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"Expert knowledge is a term covering various types of knowledge that can help define or disambiguate causal relations between two or more variables. Depending on the context, expert knowledge might refer to knowledge from randomized controlled trials, laws of physics, a broad scope of experiences in a given area, and more." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"In statistical inference and machine learning, we often talk about estimates and estimators. Estimates are basically our best guesses regarding some quantities of interest given (finite) data. Estimators are computational devices or procedures that allow us to map between a given (finite) data sample and an estimate of interest." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"In summary, the relationship between different branches of contemporary machine learning and causality is nuanced. That said, most broadly adopted machine learning models operate on rung one, not having a causal world model." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"'Let the data speak'" is a catchy and powerful slogan, but [...] data itself is not always enough. It’s worth remembering that in many cases 'data cannot speak for themselves' and we might need more information than just observations to address some of our questions." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"Matching is a family of methods for estimating causal effects by matching similar observations (or units) in the treatment and non-treatment groups. The goal of matching is to make comparisons between similar units in order to achieve as precise an estimate of the true causal effect as possible." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"Multiple regression provides scientists and analysts with a tool to perform statistical control - a procedure to remove unwanted influence from certain variables in the model." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"Non-linear associations are also quantifiable. Even linear regression can be used to model some non-linear relationships. This is possible because linear regression has to be linear in parameters, not necessarily in the data. More complex relationships can be quantified using entropy-based metrics such as mutual information. Linear models can also handle interaction terms. We talk about interaction when the model’s output depends on a multiplicative relationship between two or more variables." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"The basic goal of causal inference is to estimate the causal effect of one set of variables on another. In most cases, to do it accurately, we need to know which variables we should control for. [...] to accurately control for confounders, we need to go beyond the realm of pure statistics and use the information about the data-generating process, which can be encoded as a (causal) graph. In this sense, the ability to translate between graphical and statistical properties is central to causal inference." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"The causal interpretation of linear regression only holds when there are no spurious relationships in your data. This is the case in two scenarios: when you control for a set of all necessary variables (sometimes this set can be empty) or when your data comes from a properly designed randomized experiment." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"The first level of creativity [for evaluating causal models] is to use the refutation tests [...] The second level of creativity is available when you have access to historical data coming from randomized experiments. You can compare your observational model with the experimental results and try to adjust your model accordingly. The third level of creativity is to evaluate your modeling approach on simulated data with known outcomes. [...] The fourth level of creativity is sensitivity analysis." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

"[...] the modularity assumption states that when we perform a (perfect) intervention on one variable in the system, the only structural change that takes place in this system is the removal of this variable’s incoming edges (which is equivalent to the modification of its structural equation) and the rest of the system remains structurally unchanged." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.