.05 level - [quotes]
100% - [quotes]
ability - [quotes]
accuracy - [quotes]
adaptation - the action or process of adapting a technique to graphical representation contexts [quotes]
aggregation - [defs, quotes, quotes*]
alerts - interactive elements in the context of BI that convey important information, warnings, or notifications to users [quotes]
algorithm - [defs, quotes*]
genetic algorithm (GA) - [defs]
evolutionary algorithm - [defs]
ambiguity - [quotes]
analogy - [quotes*]
analysis - [quotes*]
business analysis - [defs]
cluster analysis - [defs]
data analysis [defs, quotes]
meta-analysis [defs]
regression analysis - [quotes*]
appropriateness - [quotes]
approximation - [quotes, quotes*]
art - any form of visual artistic representation [quotes]
assumption - an aspect accepted as true [quotes*]
attention - [quotes]
audience - [quotes]
audit - [defs]
audit trail [defs]
axes - the horizontal and vertical axes of a graph [defs, quotes]
background - [quotes]
bar graph/chart - chart that uses either horizontal or vertical bars (column chart) to show discrete, numerical comparisons across categories. [defs, quotes]
column bar/chart - [quotes]
radial column chart (aka circular column graph) - graph that uses a grid of concentric circles to plot bars on, each circle representing a value on a scale, while the radial dividers (lines spanning from the center) are used for each category or interval (if a histogram).
piled bar chart - a layered bar chart design where all the bars are sorted and aligned on a single, shared axis.
radial/circular bar chart - a bar chart plotted on a polar coordinate system, rather than on a Cartesian one.
stacked bar graph -
wrapped bar chart - 1) a chart that splits the sorted bars in a horizontal bar chart into multiple columns to eliminate the need for scrolling (Stephen Few); 2) a chart that wraps disproportionately large bars so that small values are discernible
error bar - graphical tool used in statistics to display the uncertainty, variability, or margin of error associated with a specific data point
belief - [quotes*]
bias - the distortion of a statistical result or fact [quotes, quotes*]
bin - the unit of grouping values together [quotes]
binning - a way to group or aggregate numbers or continuous values into a number of bins
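A minimal Python sketch of binning, assuming fixed-width bins. The function name, the `start` offset and the sample ages are illustrative choices, not part of the glossary:

```python
def bin_values(values, bin_width, start=0.0):
    """Count how many values fall into each fixed-width bin.

    Bin k covers the half-open interval
    [start + k * bin_width, start + (k + 1) * bin_width).
    """
    counts = {}
    for v in values:
        k = int((v - start) // bin_width)  # index of the bin containing v
        counts[k] = counts.get(k, 0) + 1
    return dict(sorted(counts.items()))


ages = [3, 7, 12, 18, 21, 25, 25, 39]  # hypothetical sample data
print(bin_values(ages, bin_width=10))  # {0: 2, 1: 2, 2: 3, 3: 1}
```

Here bin 0 holds the ages 0-9, bin 1 the ages 10-19, and so on; the counts per bin are what a histogram would plot.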
board -
BI board - [defs]
BI Competency Center [BICC]
box
black box - an opaque system, algorithm, dataset or model where the internal logic and processing steps are hidden or too complex for human interpretation [quotes*]
white box - an explainable or transparent AI/ML model used to generate charts and graphics. [quotes]
business intelligence [BI] - [defs, quotes]
operational intelligence - [defs]
self-service BI [SSBI] - [defs, quotes]
category - a class or division of people or things having certain shared characteristics [quotes]
categorization - the action or process of putting items together into classes or groups.
causality - [quotes*]
centrality - [quotes*]
averages - [quotes*]
moving average - an indicator computed as the sum of the values in the neighborhood of a point divided by the number of points considered
weighted average -
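A minimal Python sketch of the two averages above. It assumes a sliding window of consecutive values as the "neighborhood" for the moving average, and the usual sum of value-times-weight divided by the sum of weights for the weighted average; function names and sample numbers are illustrative:

```python
def moving_average(values, window):
    """Simple moving average: the mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def weighted_average(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


sales = [10, 20, 30, 40, 50]  # hypothetical sample data
print(moving_average(sales, 3))            # [20.0, 30.0, 40.0]
print(weighted_average([80, 90], [1, 3]))  # 87.5
```

Other window conventions (centered, exponentially weighted) exist; the trailing window here is only one choice.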
central limit theorem - [quotes*]
central tendency [quotes*]
certainty - [quotes*]
uncertainty - the visual representation of the imprecision, variability, or margin of error within a dataset [quotes, quotes*]
chance - [quotes*]
change - the act or process through which a variable or aspect is transformed [quotes*]
chart - a graphical representation of data by predefined graphical means [quotes]
bubble chart - multi-variable graph used to analyze patterns/correlations between variables in terms of positioning and proportions. It uses a Cartesian coordinate system to plot the variables in which each point is assigned a label or category (either displayed alongside or on a legend), respectively a third variable quantified as the area of its circle. Colors can be used to distinguish between categories or to represent an additional data variable [quotes]
bullet chart - see bullet graph
[Japanese] candlestick chart - trading tool used to visualize and analyze the price movements over a time period
charts vs. words [quotes]
donut chart - variation of a pie chart in which an area of the center is cut out. [link 1, quotes]
dot matrix chart - display method in which the discrete data are displayed in units of dots, each colored to represent a particular category and grouped together in a matrix.
flow chart (aka flow diagram, flow process chart, process chart, process map) - type of diagram that represents/abstracts the sequential steps of a process [defs, quotes]
Gantt chart - project management organization tool that displays a list of activities (or tasks) with their duration over a timeline and tracks further information needed in the process. [defs, quotes]
horizon chart -
Kagi chart - time-independent chart used to display the general levels of supply and demand of a particular asset by visualizing the price actions through a series of line patterns.
lying with charts [quotes]
mosaic plot (aka Marimekko chart) - chart used to visualize categorical data over a pair of variables with a percentage scale.
open-high-low-close (OHLC) chart - chart used to illustrate the movements in prices
parallel set charts -
Pareto chart - [defs]
PERT chart - [defs]
pictogram chart (aka pictograph/pictorial chart) - chart that uses icons to give a more engaging overall view of small sets of discrete data.
pie chart - chart that shows percentages as sectors (slices) of a circle (and thus resembling a pie) [p1, quotes]
point & figure (P&F) chart - chart used to display the relationship between supply and demand of a particular asset through a series of columns made up of X's and O's.
point chart - see scatterplot
proportional area chart -
radar chart (aka spider/web chart, polar chart) - a way of comparing multiple quantitative variables.
rose chart (aka coxcomb chart, polar area diagram) - a polar coordinate grid that displays categories of data within equally divided segments.
span chart (aka range/column bar) - chart used to display dataset ranges between a minimum value and a maximum value.
sparkline - a tiny, highly compact chart that fits inside a single cell or line of text, providing a quick, at-a-glance visual summary of data trends (spikes, drops, or economic cycles) without requiring the space of a traditional, full-sized chart [defs, quotes]
tally chart -
timeline chart - a chart that illustrates a series of events, tasks or milestones in chronological order [quotes]
trellis chart - a chart which displays a series of sub-charts that use the same scale and axes [quotes]
XY chart - a visualization that allows studying the relationship between numeric variables.
circles - [quotes]
circle packing (aka circle treemap) - variation of a treemap that uses circles instead of rectangles, each circle representing a level in the hierarchy.
clarity - the quality of being coherent and intelligible [quotes]
classification - [defs, quotes*]
classification tree - [defs]
classifier - [defs, quotes]
clearness - [quotes]
cluster - a group of similar items positioned or occurring closely together [defs, quotes]
cluster analysis - [defs, quotes]
clustering - [quotes*]
K-Means algorithm - [defs]
clutter - a group of graphical elements that tend to obscure main elements of a graphical representation [quotes]
cognition
understanding - the cognitive process of transforming raw, abstract data into meaningful, actionable insights by leveraging the human brain's visual perception system [quotes]
coincidence - [quotes*]
common sense - [quotes*]
communication - the process of imparting or exchanging data, information or knowledge within a medium via predefined channels [quotes]
comparison - the act or process of identifying similarities or dissimilarities between distinct items [quotes]
completeness - [quotes]
complexity - the state or quality of being intricate or complicated [quotes, quotes*, quotes**]
composition - [quotes]
computation - [quotes]
conclusion - [quotes, quotes*]
confidence - [quotes*]
confidence interval - [quotes*]
confirmation - [quotes]
connection - [quotes]
continuity - [quotes]
control - [quotes*]
convergence - [quotes*]
coordinates - a set of numbers which are used to determine the position of a point or a shape in an n-dimensional space (usually a 2- or 3-dimensional space, though one may choose to work in higher dimensions) [quotes]
color - the property of objects of producing different sensations on the eye as a result of the way they reflect or emit light [quotes]
communication - the purposeful process of exchanging signals, data, information or knowledge [quotes]
comparison - identifying the similarities or dissimilarities between entities
composition - the process of arranging something into specific proportion or relation
computation - a well-defined type of arithmetic or non-arithmetic calculation [quotes*]
conclusion - a judgement or decision reached by reasoning [quotes*]
confusion - the state of being unclear about something [quotes]
connections - [quotes]
constraint - [quotes*]
continuity -
control - [defs, quotes*]
control flow - [defs]
convergence - [defs, quotes]
coordinate - [quotes]
coordinate system - a combination of one or more dimensions used to represent the various elements (points, areas)
creativity - [quotes*]
criteria - [quotes*]
criticality - [quotes]
criticism - [quotes]
culture - [quotes]
curves - [quotes]
curvature - [quotes]
fractal - a curve or geometrical figure for which each part has the same inherent characteristics as the whole [defs, quotes]
kurtosis - the sharpness of the peak of a frequency-distribution curve [quotes]
slope - angle or steepness of a line segment [quotes]
data - [defs, quotes*]
big data [defs*, quotes*]
data characteristics -
density - the degree to which data points are tightly packed within a given area of a chart [quotes]
integrity - [defs]
domain integrity - [defs]
master data - [defs]
master data management (MDM) - [defs]
metadata - data about data [defs, quotes]
missing data - [quotes, quotes*]
multivariate data - [quotes]
reference data - [defs]
semi-structured data [defs]
structured data - [defs]
transactional data - [defs]
unstructured data [defs]
data analysis - [defs, quotes]
data analyst - [quotes*]
exploratory data analysis (EDA) [defs, quotes]
trend analysis [defs]
data analytics - [defs*, quotes]
advanced analytics [defs]
big data analytics [defs]
business analytics [defs]
descriptive analytics [defs]
predictive analytics [defs]
prescriptive analytics [defs]
real-time analytics [RTA] [defs]
real-time operational analytics [RTOA]
text analytics [defs]
trend analysis [defs]
data architecture - [defs, quotes]
data architect [DA] -
data asset - [defs]
data contract - [defs]
data dictionary - [defs]
data element - [defs]
data fabric - [quotes]
data flow - [defs]
data format - the standardized structure or blueprint used to encode, organize, and store data in a database or other type of repository.
Parquet format - an open-source, column-oriented file format designed for efficient data storage and retrieval in big data analytics. [notes]
Delta Parquet - an open-source storage layer built on top of Apache Parquet format that combines the highly efficient, compressed, and columnar storage of Parquet with a transaction log to bring enterprise reliability, ACID transactions, and version control to data lakes
data governance - [defs, quotes]
data lifecycle - [defs]
data lineage - [defs]
data literacy - [defs, quotes*]
data management - [defs, quotes]
data security - [defs]
data protection - [defs]
encryption - [defs]
data mining - [defs, quotes*]
text mining [defs]
data mining model - [defs]
data model - [defs, quotes*]
all models are wrong [quotes*]
data modeling - [defs]
good models [quotes*]
the truth in models [quotes*]
data navigation -
drill down - [defs]
drill up - [defs]
data operation - the operational management of data systems to ensure data is stored, protected, and accessible.
data collection - [defs, quotes*]
data compression - [defs]
data discovery - [defs]
data exploration - [defs, quotes, quotes*]
data extrapolation - [quotes]
data integration - [defs]
data mapping - [defs]
data matching - [defs]
data migration - [defs]
data preparation - [defs, quotes]
data processing - [defs, quotes]
online analytical processing [OLAP] - [defs]
online transactional processing [OLTP] - [defs]
data profiling - the initial process of analyzing raw data to understand its structure, quality, and content before further use [defs, quotes]
data scrubbing - [defs]
data sharing - [defs]
data standardization - the process of converting inconsistent raw data into a uniform, comparable format [defs, quotes]
data strategy - [defs, quotes]
data transformation - the process of converting raw, unstructured, or incompatible data into a clean, structured, and (visually) meaningful format [defs]
data pipeline - a collection of scripts, functions or other elements that pass data along in a series of transformations. [defs, quotes]
decoding - [quotes]
encoding/decoding - [quotes, quotes*]
extract, load, transform (ELT) - [defs]
extract, transform, load (ETL) - [defs]
data validation - the process of verifying that the data is accurate, consistent, and adheres to predefined quality standards [quotes]
cross-validation - [defs]
data virtualization - the process of creating a unified, virtual layer that lets you access and query data from multiple disparate sources (like SQL databases, cloud apps, and data lakes) without physically moving or copying it [defs]
data wrangling - the process of cleaning, structuring, and mapping raw data into a usable format before visualization or further use
data visualization - [defs, quotes*]
data stories - [quotes*]
data storytelling - [quotes]
storytelling - the practice of combining data, visuals, and narrative to communicate complex analytical insights [quotes]
myth - [quotes, quotes*]
narrative - [quotes]
story - the narrative structure that connects raw numbers to human understanding [quotes]
decorations - [quotes]
elements -
area - region within a graphical representation, respectively a measure of region's size [quotes]
angle - a visual cue used to represent numerical data, typically as parts of a whole. It determines the size of slices in pie charts and donut charts, where a data point’s proportion dictates the central angle of its corresponding slice. [quotes]
break - an intentional gap or discontinuity on a chart’s axis [quotes]
bullet - [quotes]
grid - a structured layout that organizes information (e.g., background reference lines) [quotes]
emphasis - the design principle of using contrast to guide the viewer’s attention directly to the most important data point, trend, or insight [quotes]
illustration - the use of pictorial graphics, drawings, or visual embellishments to complement and explain data characteristics [quotes]
gap - a missing, incomplete, or unavailable portion of data that creates a break or discontinuity in a chart [quotes]
style - the aesthetic and functional design choices that shape how data is presented [quotes]
visibility - the clarity, prominence, and perceptual ease with which key data points, patterns, or dashboard components are displayed [quote]
shade - a color mixed with black, making it a darker variation of a base hue, primarily used to represent quantitative data, create gradients, or highlight specific elements without overwhelming the design [quotes]
shape - the geometric markers (e.g. circles, squares, or triangles) used to represent categorical data, or the overall distribution pattern of quantitative datasets (e.g. symmetric or skewed curves) [quotes]
scorecard - a compact display designed to summarize a KPI, explicitly showing current performance or progress associated with a set target or goal [defs, quotes]
timeline - a graphical way of displaying a list of events in chronological order
visual - 1) an image or visualization created from data, used standalone or accompanying textual or other forms of content [quotes]
visual calculation - a type of calculation that's defined and executed directly on a visual
aesthetics - a set of principles concerned with the nature and appreciation of beauty in graphical representation [quotes]
beauty - the intersection of function and aesthetics. It refers to designs where visual elegance - through color, layout, and typography - actively enhances the viewer's ability to interpret data, spot patterns, and grasp insights intuitively. [quotes, quotes*]
function - the primary purpose, utility, or role a visual graphic plays in translating complex information into easily digestible, actionable insights.
data point - [quotes]
deltas (Δ) - the change or difference between two data points or states that measures the impact, growth, or deviation between variables, baselines, periods, etc.
inflection point -
data quality - [defs, quotes]
data cleaning (aka data cleansing) - [defs, quotes*]
data quality dimension - [defs]
accessibility - [defs]
accuracy - [defs, quotes*]
availability - [defs, quotes*]
completeness - the state or condition of having all the necessary or appropriate parts or characteristics [defs]
complexity - the characteristic of something that make it difficult to understand [quotes*]
conformity - [defs]
consistency - [defs, quotes*]
granularity (aka data granularity) - [defs]
fuzziness - the quality of being indistinct and without sharp outlines [quotes*]
integrity - [quotes]
referential integrity - [defs]
timeliness - [defs]
validity - the degree to which a visual accurately represents the underlying data and translates it into meaningful, truthful insights without distortion, bias, or misleading design choices [defs]
data quality management - [defs]
data validation - [defs]
cross-validation - [defs]
dirty data - [defs]
data repository
data lake - [defs, quotes]
data lakehouse - [quotes]
data mart - [defs, quotes]
data mesh - [defs, quotes]
data product - [defs, quotes*]
data store - [quotes]
operational data store (ODS) - [defs]
data warehouse - [defs, quotes]
data warehousing - [defs]
delta lake - [defs]
data roles -
data steward - [defs]
data stewardship - [defs]
data science - [defs*, quotes*]
conjecture - an opinion or conclusion formed on the basis of incomplete information or without proof [quotes*]
data augmentation - [defs]
data classification - [defs, quotes]
k-nearest neighbors [defs, quotes]
data scientist - [quotes*]
simulation - [defs, quotes*]
Monte Carlo simulation [defs, quotes]
data mining -
text mining - [defs, quotes]
data structure - [defs]
dataset - [quotes*]
matrix - [quotes]
snapshot - a frozen, read-only view of data captured at a specific point in time [defs]
tuple -
vector - the underlying mathematical structure of the data points being plotted [quotes]
database - [defs, quotes]
database design - [defs]
database objects
index - [quotes]
function - a relation or expression that involves one or more variables, respectively an outcome uniquely defined based on the input values
hash - [defs]
hashing - [defs]
hash function - a function that maps input of arbitrary size to a fixed-size value (the hash) and always returns the same output for the same input [defs]
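A minimal Python sketch of a hash function in use, relying on SHA-256 from the standard `hashlib` module. The helper names and the bucket example are illustrative assumptions, not something the glossary prescribes:

```python
import hashlib


def stable_hash(text):
    """Return the SHA-256 digest of `text` as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def bucket_for(key, bucket_count):
    """Map a key to one of `bucket_count` buckets using the first 8 digest bytes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count


# Same input, same output; the digest length does not depend on the input length.
print(stable_hash("customer-42") == stable_hash("customer-42"))  # True
print(len(stable_hash("a")), len(stable_hash("a" * 10_000)))     # 64 64
print(0 <= bucket_for("customer-42", 8) < 8)                     # True
```

Bucketing by hash is how a key can be assigned to a partition or a hash-table slot without storing any lookup table.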
system function -
user-defined function - routine that accepts parameters, performs an action, such as a complex calculation, and returns the result of that action as a value. [more]
schema - data's underlying logical structure and organization [defs]
database schema - [defs]
snowflake schema - [defs]
star schema - [defs]
table - [quotes]
delta table - an enhanced, high-performance data table format built on top of standard Parquet files [notes]
dimension table -
fact table - [defs]
in-memory table -
system table -
temporary table (aka temp table) -
time table -
sequence - a user-defined, schema-bound database object that generates a series of numeric values according to a defined specification.
stored procedure - precompiled SQL code that is saved as a (user-defined) object and reused
view - a specific visual arrangement or configuration of data on a screen, which allows analysts to explore data relationships or communicate insights [quotes]
database management system (DBMS) - [defs]
operational database - [defs]
deception - [quotes]
decision - [quotes*]
decision-making - [quotes, quotes*]
decision support - [defs]
decision support system - [defs, quotes]
decision theory - [quotes*]
definition - [quotes*]
design - [quotes]
good design - [quotes]
designer - [quotes]
diagram - [quotes]
arc diagram - method of representing two-dimensional network diagrams to identify co-occurrences in data. Nodes are placed along a single line (a one-dimensional axis) and arcs are used to show connections between those nodes, where the thickness of each arc line can be used to represent frequency between the source and target node [quotes]
causal loop diagram [CLD] - causal diagram that aids in visualizing how different variables in a system are causally interrelated [defs]
cause-effect diagram - a visual tool used to logically organize possible causes for a specific problem or effect by graphically displaying them in increasing detail [quotes]
Chord diagram - type of diagram used to visualize the inter-relationships between entities that have something in common (ideal for comparing the similarities within a dataset or between different groups of data).
data flow diagram - [defs]
dendrogram - a diagram that shows the hierarchical relationship between objects.
diagramming - [quotes]
histogram - diagram that visualizes the distribution of data over a continuous interval or certain time period, each bar representing the tabulated frequency at each interval/bin. Besides providing a rough view of the probability distribution, histograms help estimate where values are concentrated, what the extremes are, and whether there are any gaps or unusual values. [defs, quotes]
multiples (aka small multiples) - multiple images or diagrams of the same quantity for different times or parameter values [quotes]
network diagram (aka node-link diagram) - network-like visualization that shows how things are interconnected through the use of nodes/vertices and link lines that represent their connections and relationship type.
polar area diagram - see rose chart
Sankey diagram - flow diagram that emphasizes flow/movement/change from one state to another or one time to another, respectively their quantities in proportion to one another.
spike histograms - a variation of a traditional histogram where frequencies are represented by thin vertical lines (spikes) rather than thick, solid bars.
spinogram (spine histogram) - extends a standard histogram by varying the width of the bars to represent frequencies, rather than just their heights
sunburst diagram (aka sunburst chart, ring chart, radial treemap) - visualization that shows a hierarchy through a series of rings, that are sliced for each category node. Each ring corresponds to a level in the hierarchy, with the central circle representing the root node and the hierarchy moving outwards from it.
tree diagram - [quotes]
Venn diagram (aka set diagram) - a diagram that visually displays all the possible logical relationships between a collection of sets [quotes]
dimension - [quotes]
cardinality - the number of elements corresponding to a dimension
conformed dimension - [defs]
degenerate dimension - [defs]
dimension hierarchy - [defs]
dimensionality - [quotes, quotes*]
role-playing dimension - [defs]
discovery - [quotes*]
discrete -
effectiveness - [quotes]
error - [quotes*]
type I error - [defs]
type II error - [defs]
errors in statistics - [quotes]
blunders - [quotes*]
distortion - [quotes]
misleading - [quotes]
mistakes - [quotes]
estimation - [quotes*]
parameter estimation [defs]
excellence - [quotes]
execution - [quotes]
executives - [quotes]
experience - [quotes, quotes*]
experiment - [quotes*]
explanation - [quotes]
extreme(s) - [quotes]
event - [quotes*]
rare event - [quotes*]
evidence - [quotes*]
fact - a thing that is known or proved to be true [quotes*]
failure - an act or instance of failing or proving unsuccessful [quotes, quotes*]
feature - a typical or important quality, characteristic or part of something broader [quotes*]
feature extraction - [defs]
feature selection - [defs]
figure - 1) a number that is part of (official) statistics or relates to financial results; 2) a geometric shape that is a combination of lines, points, or planes [quotes]
cube - [defs]
funnel - a multilayered tube-like device that is wide at the top and narrow at the bottom, used for guiding and filtering the content toward the bottom, typically as part of a multistep process, though not everything that goes in may reach the bottom
fitting - [quotes*]
overfitting - [defs, quotes*]
underfitting - [quotes*]
flow - [quotes]
fluctuation - an irregular rising and falling in number or amount [quotes]
forecast - a calculation or estimate of future events [defs, quotes*]
forecasting methods - [defs]
Delphi method - [defs]
framework -
grammar - systematic framework for building graphics by breaking every visualization into a set of fundamental, reusable components [quotes]
structure - the underlying logical framework used to organize, map, and present data points so the human brain can interpret them efficiently [quotes]
generalization - a general statement or concept obtained by inference from a set of specific cases, statements, information, knowledge, etc. [defs*, quotes*]
goals - an aim or desired result [defs, quotes, quotes*]
good, bad, ugly -
bad - [quotes]
good - [quotes]
ugly - [quotes]
graph - a network of points connected by lines showing the relationships between elements [def, quotes]
area graph - variation of line graphs in which the area below the line is filled with colors or textures; used to display the development of quantitative values over an interval or time period [quotes]
stacked area graphs - graphical representation form that depicts how parts of a whole change over time, showing both the cumulative total and the contribution of each component within that total
braided graph - chart in which the filled areas are sorted in depth order for each position along the time axis.
bullet graphs - a streamlined data visualization designed to track progress against a target or benchmark; developed by Stephen Few as an alternative to dashboard gauges and meters
line graph - used to display quantitative values over a continuous interval or time period. [quotes]
network graph - see network diagram
nomograph (nomogram) - graphical calculating tool that allows solving a specific mathematical equation visually by using a straightedge to connect known values on two or more scaled lines [quotes]
point graph - see scatterplot
tree - a graphical representation of hierarchical data that displays relationships where a single origin point (the root) connects to child items, which then branch out further [defs]
decision trees - [defs, quotes*]
graphic - [def, quotes]
bad graphics [quotes]
good graphics [quotes]
infographic - a visual representation of information or data in a graphical form [defs, quotes]
Hadoop - [defs]
headline - [quotes]
hierarchy - [quotes]
hypothesis - [quotes, quotes*]
null hypothesis [quotes*]
hypothesis testing - [quotes*]
icons - [quotes]
idealization - the action of regarding or representing something under ideal circumstances [quotes, quotes*]
images -
improvement - [quotes]
independence - [defs, quotes*]
indicator - [defs, quotes]
index number [quotes]
lagging indicator - [defs]
leading indicator - [defs]
performance indicator (PI) - [defs, quotes]
key performance indicator (KPI) - [defs, quotes, posts]
reference indicator (RI) - [defs, quotes]
key reference indicator (KRI) - [defs, quotes, posts]
induction - [quotes]
information - [defs, quotes, quotes*]
information design - [quotes]
information overload - [defs]
ink - [quotes]
inquiry - [quotes]
intelligence - [quotes*]
Artificial Intelligence (AI) - the capability of a computer or machine to perform tasks that normally require human intelligence (e.g. learning, reasoning, problem‑solving, perception, decision‑making) [defs, quotes, quotes*]
business intelligence [defs, quotes]
swarm intelligence [defs, quotes]
interpretation - [quotes]
misinterpretation [quotes]
intuition - [quotes*]
inquiry - [quotes*]
invariance - [quotes*]
iteration - [quotes*]
junk chart - elements of a chart or graph that do not add value to the data being presented and risk distracting or confusing the reader [quotes]
laws - [quotes*]
knowledge - [quotes]
label - [quotes]
lakehouse - [defs, quotes]
language - [defs, quotes]
natural language processing (NLP) - [defs, quotes]
leadership - [quotes]
learning - [defs, quotes]
deep learning [defs, quotes*]
learning transfer - [quotes*]
machine learning (ML) - [defs, quotes]
boosting - ML technique that converts a set of simple, low-accuracy models (aka weak learners) into a single, highly accurate predictive model (aka strong learner) by training the models sequentially, where each new model is specifically trained to correct the errors made by the models that came before it [quotes*]
reinforcement learning - [quotes*]
semi-supervised learning [defs]
supervised learning [defs]
support vector machine [SVM] - a supervised machine learning algorithm used for classification and regression tasks, particularly used in finding the optimal boundary (a hyperplane) to separate data into different classes by maximizing the margin between them [defs, quotes]
unsupervised learning [defs]
least squares - [quotes*]
linearity - [quotes, quotes*]
nonlinearity - [quotes*]
location - [quotes*]
logarithm - [quotes*]
logarithmic chart - a chart that uses the logarithmic scale [quotes]
logarithmic scale - method used to display numerical values that span a broad range in a nonlinear form
logic -
bimodal logic [defs]
fuzzy logic [defs]
magnitude - [quotes]
gradient - an increase or decrease in the magnitude of a property or characteristic observed in passing from one point or moment to another [defs, quotes]
map - a diagrammatic representation of a geographical area [quotes]
bubble map - a map in which circles are displayed over a designated geographical region with the area of the circle proportional to its value in the dataset. Used for comparing proportions over geographic regions, though large bubbles can overlap other bubbles and regions on the map [quotes]
Choropleth map - a map that visualizes values over a geographical area, with regions colored, shaded or patterned in relation to a data variable. [quotes]
circle treemap - see circle packing
connection map (aka link map, ray map) - map in which the points are connected by straight or curved lines, and used to display connections and relationships geographically, respectively chain of links.
density map - data visualization technique that uses color gradients to represent the concentration or intensity of data points across a given area [quotes]
dot map (aka point map, dot distribution map, dot density map) - a thematic data visualization used to show the geographic distribution and density of data across a region.
flow map - diagram that shows the movement of information or objects from one geographical location to another and the corresponding values. [defs]
heatmap - data visualization technique that uses colors to represent the magnitude or intensity of data values [defs, quotes]
knowledge map - a visual or structural representation of how different concepts, data points, or information assets relate to one another [quotes]
mind map - a concept-based diagram used to map associated ideas, words, images and concepts together. [defs]
network map - visual representation of the physical or logical layout of a network-like structure (see network diagram)
process map - see flow chart
self-organizing map - an unsupervised machine learning technique used to visualize and cluster complex, high-dimensional data [defs]
tree map - a data visualization tool that displays hierarchical (tree-structured) data using nested rectangles. [quotes]
MapReduce - core programming model used to process massive datasets in parallel across distributed clusters [defs]
matching - [defs, quotes*]
meaning - [quotes]
measure - a quantifiable, numerical value used to track, analyze, or calculate data [defs]
correlation - measure that shows how much one set of values depends on another. If the values increase together, they are positively correlated. If the values from one set increase as the other decreases, they are negatively correlated. There is no correlation when a change in one set has nothing to do with a change in the other. [defs, quotes*]
deviation - measure that shows how much a value differs from a reference point (e.g. target, average, forecast, zero)
standard deviation - measure that shows how spread out the data is around the mean. [defs, quotes*]
distance - mathematical score that measures how distant and occasionally how similar/dissimilar two data points are [quotes]
overlapping - instances where data points, datasets, or time segments share commonalities, cover the same space, or contain redundancies [quotes]
skewness - measure of the asymmetry of a data distribution [quotes]
measurement - the systematic process of observing, capturing, and assigning numerical values or categories to real-world phenomena, variables, or events [quotes, quotes*]
metric - a calculated, quantifiable measure used to track and assess the performance, success, health or other aspect of a specific business activity or process over time [defs, quotes]
proximity - the state of being near in space, time or relationship (one point finds itself in the proximity of another) [quotes]
mental elements -
message - [quotes]
metaphor - a thing or characteristic regarded as representative or symbolic of something else [quotes]
mental processes
deduction - [quotes*]
heuristic - [defs, quotes*]
guessing - [quotes*]
method - [quotes, quotes*]
scientific method - [quotes*]
models [defs, quotes*]
data model - [defs, quotes]
dimensional model - [defs]
mathematical model - [defs, quotes]
mathematical modeling - [defs, quotes*]
metamodel - [defs, quotes]
modeling - [quotes]
dimension modeling - [defs]
statistical modeling - [quotes*]
probabilistic models - [quotes*]
simulation model - [defs]
truth in models [quotes*]
bad models [quotes*]
good models [quotes*]
network -
Backpropagation network
Bayesian network [defs, quotes*]
generative network [defs]
generative adversarial network [defs]
neural network (NN) [defs, quotes*]
artificial neural network (ANN) [defs]
Boltzmann machine [defs, quotes]
convolutional neural network (CNN) [defs]
recurrent neural network [defs]
semantic network [defs]
supervised network [defs]
semi-supervised network [defs]
unsupervised network [defs]
neural -
neuron - [defs]
normality - [quotes*]
normalization - [defs*]
numeracy - [quotes]
objectivity - [defs, quotes]
observation - [quotes*]
opportunity - [quotes]
optimization - [defs, quotes*]
order - [quotes]
organization - [quotes]
self-organization [defs, quotes]
orientation - [quotes]
origin - [quotes]
outlier - value that is atypical within a particular group, class, or category [quotes, quotes*]
overfitting - occurs when a model fits too closely or even exactly to its training data, and thus it can’t make accurate predictions or conclusions when data with new characteristics is introduced [defs*, quotes*]
overplotting - scenario in which the data or labels in a data visualization overlap, making it difficult to distinguish between the individual data points. It occurs when there is a large number of data points and/or a small number of unique values in the dataset.
p-value - [quotes*]
parameter - [quotes*]
partition - [defs, quotes]
parts-to-whole - [quotes]
pattern - [defs, quotes, quotes*]
pattern recognition [defs]
percentage - [quotes]
percentiles & quantiles - [quotes*]
perceptron - [defs, quotes]
multilayer perceptron [defs]
perfection - [quotes]
perspective - [quotes]
perception - [quotes]
pictures - [quotes]
pitfalls - [quotes]
pivot/unpivot - [defs]
planning - [defs, quotes, quotes*]
plausibility - [quotes*]
plot - [quotes]
box plot - data visualization used to display the data distribution through their quartiles [quotes, quotes*]
density plot (aka kernel density plot, density trace graph) -
dot plot - [quotes]
jump plot -
mosaic plot - see mosaic
parallel coordinates plot - type of visualization used for plotting multivariate, numerical data, which allows comparing many variables together and seeing the relationships between them
plotting - [quotes]
scatterplot (aka scatter/point graph/chart, X-Y plot) - [quotes]
spineplot - variation of bar chart that uses normalized bar lengths while the bar widths are proportional to the number of values in the category.
spiral plot (aka time series spiral) - type of visualization that plots time-based data along an Archimedean spiral.
stem and leaf plot (aka stemplot) -
violin plot - a combination of a box plot and a density plot, used to visualize the distribution of the data and its probability density [quotes]
population pyramid - a pair of back-to-back histograms (one for each sex) that displays the distribution of a population across all age groups.
position - [quotes]
power - [quotes]
precision - [defs, quotes, quotes*]
prediction - [quotes, quotes*]
presentation - [quotes]
principle - [quotes*]
probability - [quotes*]
frequency - [quotes]
problem - [quotes, quotes*]
problem solving - [quotes*]
procedure - a set of actions that is the official or agreed way of doing something [defs]
statistical procedure - [defs]
bootstrapping - [defs, quotes]
process - a series of actions or steps taken in order to achieve a particular end [quotes]
stochastic process - [defs, quotes]
Markov process - [defs]
projection - an estimate or forecast of a future situation based on a study of present trends, circumstances or other factors [quotes]
propagation - the spreading of something, e.g. signals, data, information, knowledge [quotes]
backpropagation - [ML] a gradient estimation method commonly used for training a neural network to compute its parameter updates [defs]
forward propagation - [ML] the process where raw input data is pushed forward through the network to generate an output or prediction.
quantitative vs. qualitative - [defs, quotes]
quality - [defs, quotes, quotes***]
puzzles - [quotes*]
randomization - [quotes]
randomness - [quotes*]
rating - [quotes]
reading - [quotes]
reality - [quotes]
record -
system of record (SoR) - the authoritative, trusted software or database where an organization’s original data is created, governed, and maintained [defs]
rectangles - [quotes]
regression - the graphical representation of statistical models used to estimate the relationship between a dependent variable (the outcome) and one or more independent variables (the predictors) [defs, quotes*]
regression - [quotes*]
regression analysis - [defs, quotes]
linear regression [defs]
logistic regression [defs]
regression toward the mean - [quotes*]
regularity - the use of patterned structures, symmetry, and repeating elements to help the human brain quickly identify patterns, trends, and outliers [quotes*]
regularization - an ML technique used to prevent models from overfitting that works by adding a penalty to the loss function during training, which discourages the model from becoming overly complex [defs, quotes]
relation - [quotes*]
relevance - [quotes]
reporting model - [defs]
representation - [quotes]
resolution - [quotes]
research - [defs, quotes, quotes*]
residual - the difference between an actual observed data point and the value predicted by a statistical model [quotes*]
resolution - the level of detail, granularity, or frequency at which data is collected and plotted [quotes]
responsibility - [quotes]
results - [quotes]
risk - [quotes*]
risk analysis - [quotes]
sensitivity analysis [defs]
rules - [quotes]
business rule - [defs]
fuzzy rule - [defs]
rulings - [quotes]
sample - [quotes*]
sampling - [defs, quotes*]
scale - [quotes]
science - [quotes]
the science in data science - [quotes]
semantic - the meaning and context of the data, rather than just its visual appearance or raw numbers [defs, quotes]
sensitivity - the degree of responsiveness of a model or chart's output to changes in its input variables [quotes]
sensitivity analysis - [defs, quotes]
series (aka time series) - the graphical representation of data points collected sequentially over uniform or consistent intervals of time [quotes]
set - [defs]
fuzzy set [defs, quotes]
training set [defs]
signal - the core underlying message, trend, or meaningful pattern within a dataset [quotes, quotes*]
significance - [quotes*]
similarity - [defs, quotes]
self-similarity [defs]
simplicity - [quotes, quotes*]
simplification - [quotes]
size - [quotes]
small multiples - a collection of miniature illustrations displayed and perceived as one diagram
smoothing - a statistical technique used to remove 'noise' and short-term fluctuations from a dataset, making underlying trends, patterns, and cycles much easier to see [quotes]
report snapshot - [defs]
software - [quotes*]
speech recognition - [defs]
spikes - a sudden, sharp, and temporary deviation in a data plot, usually a steep peak or a deep drop [quotes]
staging (area) - [defs]
standard deviation - a visual metric that shows how spread out data points are around their average (mean) [defs, quotes]
standard - a set of core principles designed to make information accurate, clear, and easy to interpret at a glance [quotes]
standardization - the process of converting data to a common format to enable users to process and analyze it [quotes]
dispersion - [quotes*]
statistics - [quotes, quotes*]
Bayesian statistics - [quotes*]
descriptive statistics - [defs]
statistical significance - indicates that an observed pattern or difference in data is unlikely to have occurred by random chance alone. [quotes]
statistician - [quotes*]
(mis)usage - [quotes]
usage - [quotes]
distribution - [defs, quotes, quotes*]
Gaussian distribution - [defs]
normal distribution - [defs, quotes]
weights - statistical adjustments applied to data points to ensure they accurately represent a broader population [quotes, quotes*]
success - the degree to which a visual representation allows the audience to quickly comprehend trends, recognize outliers, and make informed decisions without confusion [quotes, quotes*]
summary - an aggregation value based on the data that condenses large, complex datasets into easily understandable visual formats [quotes*]
summary statistics - the use of aggregated numerical measures (e.g. means, medians, standard deviations) to represent large datasets in visual formats
symbol - a sign, a mark or character used as a conventional representation of an object, function, process or other similar purpose [quotes]
symmetry - the property of an object that remains unchanged after being reflected, rotated or slid [quotes, quotes*]
asymmetry - [quotes*]
test - [defs]
non-parametric test - [defs]
testing - [quotes]
theorems -
central limit theorem - [quotes*]
there's no free lunch - [quotes*]
theory - [quotes*]
time - the graphical representation of data points collected sequentially over chronological periods (days, months, years) to reveal patterns, trends, and anomalies [quotes]
time-table - a graphical representation that provides a time-based display of a set of data points that typically have a certain cyclical pattern [quotes]
time series - any data plotted over chronological order, usually at regular intervals [defs, quotes*]
title - primary line of text that tells the viewer in one short sentence what the data represent, what the chart is about, or what the main takeaway or subject is [quotes]
subtitle - secondary line of text that provides context the viewer needs to interpret the chart correctly, clarifying details that don’t belong in the title but are essential for understanding the data
tool - [quotes, quotes*]
domain - a grouping of data, technology, teams and/or people considered for the same analytical purposes [defs, quotes*]
total -
rolling totals -
training data - [quotes*]
transformation - the process of modifying, restructuring, or mathematically converting raw data to make it compatible with visual formats and easier to interpret [quotes, quotes*]
trend - the general, long-term direction (upward, downward, or steady) of a variable over time or across a sequential dataset [quotes]
trust - [quotes*]
truth -
single version of truth [SVoT]/single source of truth [SSoT] - data identified as trusted and authoritative for key metrics [defs, quotes]
lying with statistics - the use of deceptive design choices in charts and graphs to distort the underlying data and manipulate the viewer's perception [quotes*]
torturing the data in statistics - the unethical practice of manipulating, analyzing, or mining data until it yields the desired conclusion [quotes]
truth in models [quotes]
usefulness - [quotes]
user - [quotes*]
value - [quotes]
values - [quotes]
variable - [quotes, quotes*]
continuous variable - [quotes]
discrete variable - [quotes]
variability - the graphical representation of how spread out or dispersed data points are within a dataset [quotes*]
variance - the spread of data points [defs, quotes*]
variation - how spread out, dispersed, or varied data points are from a central average [quotes, quotes*]
versioning - [quotes**]
viewer - the person interpreting the data (audience) or a software component used to display the charts [quotes]
volume - an amount or quantity of something [quotes]
word-cloud (tag cloud) - method that displays how frequently words appear in a given body of text, by making the size of each word proportional to its frequency.
writing - [quotes]
(*) Data Science
(**) Data Management
(***) Management
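
Example: the entries for deviation, standard deviation and correlation describe simple calculations, so a minimal Python sketch may help make them concrete. It is only an illustration: the function names are made up for this example, the population form of the standard deviation is assumed, and Pearson's coefficient is assumed as the correlation measure (the glossary does not prescribe one).

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def deviation(value, reference):
    """How much a value differs from a reference point (target, average, zero...)."""
    return value - reference


def standard_deviation(values):
    """Population standard deviation: spread of the values around their mean."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long series (assumed measure)."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Illustrative data only
sales = [1, 2, 3, 4]
rising = [2, 4, 6, 8]    # increases together with sales
falling = [8, 6, 4, 2]   # decreases as sales increase

positive = correlation(sales, rising)
negative = correlation(sales, falling)
spread = standard_deviation([2, 4, 4, 4, 5, 5, 7, 9])
gap = deviation(12, 10)
```

Values near +1 indicate that the two series increase together, values near -1 that one decreases as the other increases, and values near 0 that there is no linear relationship between them.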