15 August 2011

📈Graphical Representation: Causal Loop Diagrams (Definition)

"One of the tools of systems thinking. Causal loop diagrams capture how variables in a system are interrelated. A CLD takes the form of a closed loop that depicts cause-and-effect linkages." (Virginia Anderson & Lauren Johnson, "Systems Thinking Basics: From Concepts to Casual Loops", 1997)

"A CLD is a graphic model of some of the key system variables connected by arrows that denote the causal influences among the variables. Each arrowhead is identified as either positive (+) or negative (-) to indicate how the dependent variable changes when the independent variable changes." (Daniel D Burke, System Dynamics-based Computer Simulations and Evaluation, 2006)

"A CLD is diagrammatic tool used to describe the causal relationship between key quantities and to identify feedback mechanisms." (Dina Neiger & Leonid Churilov, "Integration of Diagrammatic Business Modeling Tools", 2008)

"A network of actuators connected together is called a causal loop diagram. A causal loop diagram shows how potential business actions lead to complex dynamic effects." (David M Bridgeland & Ron Zahavi, "Business Modeling: A Practical Guide to Realizing Business Value", 2009)

"A tool that captures the causal interrelationships amongst a set of variables. CLDs reveal systemic patterns underlying complex relationships and highlight hidden causes and unintended consequences." (Kambiz E Maani, "Systems Thinking and the Internet from Independence to Interdependence", 2009)

"Causal loop diagramming is a form of cause-and-effect modeling. The diagrams represent systems and their behaviors as a collection of nodes and links. Nodes represent the things in a system, and links illustrate interactions and influences." (Olivia Parr Rudd, "Business Intelligence Success Factors: Tools for Aligning Your Business in the Global Economy", 2009)

"Causal loop diagrams (CLDs) are a kind of systems thinking tool. These diagrams consist of arrows connecting variables (things that change over time) in a way that shows how one variable affects another." (Raed M Al-Qirem & Saad G Yaseen, "Modelling a Small Firm in Jordan Using System Dynamics", 2010)

"A visual representation of a system's feedback loops, where positive loops cycle clockwise, and negative loops cycle counter-clockwise." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"In systems thinking terms, causal loop diagrams are simplified ways to describe essential elements and relationships in a system. These diagrams include curved causal-link arrows (depicting influence from cause to effect) and the polarity of that linkage. An 's' indicates that cause and effect move in the same direction and an  'o' shows that they move in opposite directions (e.g., when cause increases, effect decreases below what it would have been). Causal-link arrows combine into balancing (B) and reinforcing (R) feedback loops. Significant lags between an action and the effects of that action appear as 'delay' on the causal-link arrows." (Karen L Higgins, "Economic Growth and Sustainability: Systems Thinking for a Complex World", 2015)

"A causal loop diagram (CLD) is a causal diagram that aids in visualizing how a number of variables in a system are interrelated and drive cause-and-effect processes. The diagram consists of a set of nodes and edges. Nodes represent the variables, and edges are the links that represent a connection or a relation between the two variables." (Andreas F Vermeulen, "Practical Data Science: A Guide to Building the Technology Stack for Turning Data Lakes into Business Assets", 2018)

"Methodology to build conceptual or simulation models depicting the causal structure of a complex system." (Francesca Costanza & Pietro Fontana, "Distributing Mutual Advantages in Italian Cooperatives: An Analysis of Patronage Refunds", 2019)

"In system dynamics modelling, they are closed causal chains involving relevant variables, whose interactions are responsible for the patterns of behavior taking place within a certain system." (Francesca Costanza, "Managing Patients' Organizations to Improve Healthcare: Emerging Research and Opportunities", 2020)

14 August 2011

📈Graphical Representation: Pareto Chart/Diagram (Definitions)

"Fundamental tool for determining which characteristic is causing problems in a given process. Constructed by categorizing data, ranking, and plotting frequency of occurrence in bar-chart form in descending order along the x axis. Sometimes dollars are plotted on the y axis to emphasize the cost factor." (Alan Wa Steiss, "Strategic Management for Public and Nonprofit Organizations", 2003)

"A graphical tool for ranking causes from most significant to least significant." (Sohail Anwar, "Quality Management and Control", 2009)

"A chart showing both bars and a line, where the line shows the cumulative total of the individual bars going left to right." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A histogram, ordered by frequency of occurrence, that shows how many results were generated by each identified cause." (Cynthia Stackpole, "PMP® Certification All-in-One For Dummies", 2011)

"A method of displaying data values over time and classification" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"A basic Pareto chart analyzes the unique values of a process variable, which are referred to as Pareto categories or levels. These values typically represent problems encountered during some phase of a manufacturing or service activity." (SAS)

"A Pareto chart is a bar graph. The lengths of the bars represent frequency or cost (time or money), and are arranged with longest bars on the left and the shortest to the right. In this way the chart visually depicts which situations are more significant." (ASQ) [source]

[Pareto analysis] "A statistical technique in decision making that is used for selection of a limited number of factors that produce significant overall effect. In terms of quality improvement, a large majority of problems (80%) are produced by a few key causes (20%)." (IQBBA)

📈Graphical Representation: Data Flow Diagram (Definitions)

"A diagram that shows the data flows in an organization, including sources of data, where data are stored, and processes that transform data." (Jan L Harrington, "Relational Database Dessign: Clearly Explained" 2nd Ed., 2002)

"A diagram of the data flow from sources through processes and files to users. A source or user is represented by a square; a data file is represented by rectangles with missing righthand edges; a process is represented by a circle or rounded rectangle; and a data flow is represented by an arrow." (Jens Mende, "Data Flow Diagram Use to Plan Empirical Research Projects", 2009)

"A diagram used in functional analysis which specifies the functions of the system, the inputs/outputs from/to external (user) entities, and the data being retrieved from or updating data stores. There are well-defined rules for specifying correct DFDs, as well as for creating hierarchies of interrelated DFDs." (Peretz Shoval & Judith Kabeli, "Functional and Object-Oriented Methodology for Analysis and Design", 2009)

[Control Data Flow Graph (CDFG):] " Represents the control flow and the data dependencies in a program." (Alexander Dreweke et al, "Text Mining in Program Code", 2009)

"A graphic method for documenting the flow of data within an organization." (Jan L Harrington, "Relational Database Design and Implementation: Clearly explained" 3rd Ed., 2009)

"A graphic representation of the interactions between different processes in an organization in terms of data flow communications among them. This may be a physical data flow diagram that describes processes and flows in terms of the mechanisms involved, a logical data flow diagram that is without any representation of the mechansm, or an essential data flow diagram that is a logical data flow diagram organized in terms of the processes that respond to each external event." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"Data-flow diagrams (DFDs) are system models that show a functional perspective where each transformation represents a single function or process. DFDs are used to show how data flows through a sequence of processing steps." (Ian Sommerville, "Software Engineering" 9th Ed., 2011)

"A model of the system that shows the system’s processes, the data that flow between them (hence the name), and the data stores used by the processes. The data flow diagram shows the system as a network of processes, and is thought to be the most easily recognized of all the analysis models." (James Robertson et al, "Complete Systems Analysis: The Workbook, the Textbook, the Answers", 2013)

"A picture of the movement of data between external entities and the processes and data stores within a system." (Jeffrey A Hoffer et al, "Modern Systems Analysis and Design" 7th Ed., 2014)

"A schematic indicating the direction of the movement of data" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"A Data Flow Diagram (DFD) is a graphical representation of the 'flow' of data through an information system, modeling its process aspects. Often it is a preliminary step used to create an overview of the system that can later be elaborated." (Henrikvon Scheel et al, "Process Concept Evolution", 2015)

"Data flow maps are tools that graphically represent the results of a comprehensive data assessment to illustrate what information comes into an organization, for what purposes that information is used, and who has access to that information." (James R Kalyvas & Michael R Overly, "Big Data: A Business and Legal Guide", 2015)

"A graphical representation of the logical or conceptual movement of data within an existing or planned system." (George Tillmann, "Usage-Driven Database Design: From Logical Data Modeling through Physical Schmea Definition", 2017)

"a visual depiction using standard symbols and conventions of the sources of, movement of, operations on, and storage of data." (Meredith Zozus, "The Data Book: Collection and Management of Research Data", 2017)

"A data-flow diagram is a way of representing a flow of data through a process or a system (usually an information system). The DFD also provides information about the outputs and inputs of each entity and the process itself." (Wikipedia) [source]

"A graphical representation of the sequence and possible changes of the state of data objects, where the state of an object is any of: creation, usage, or destruction." (IQBBA)

📈Graphical Representation: Cause-Effect Diagram (Definitions)

"A chart that can be used to systematically gather the problem causes of quality defects. Sometimes referred to as the 5M- or 6M-chart because most causes can be related to man (e.g., human factors), machine, method, material, milieu (i.e., the work environment), or the medium (the IT-platform)." (Martin J Eppler, "Managing Information Quality" 2nd Ed., 2006)

"A root cause approach to identifying, exploring, and graphically displaying all possible causes of an issue using a standard quality technique." (Danette McGilvray, "Executing Data Quality Projects", 2008)

"A chart that links an outcome to chains of possible contributing factors as tree structure working backwards from an event to determine possible root causes, drawn sideways so that it resembles the skeleton of a fish. Because the chart resembles the skeleton of a fish, it is often called a fishbone diagram." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A chart associated with a technique for identifying factors linked to potential problems, risks, or effects. Factors can include time, machinery, methods, material, energy, measurements, personnel, and environment. Also known as the Ishikawa diagram or fishbone diagram." (Bonnie Biafore & Teresa Stover, "Your Project Management Coach: Best Practices for Managing Projects in the Real World", 2012)

"A decomposition technique that helps trace an undesirable effect back to its root cause." (For Dummies, "PMP Certification All-in-One For Dummies" 2nd Ed., 2013)

"A model used for identifying cause and effect. Also known as a fishbone diagram and named after its creator, Kaoru Ishikawa." (Sally-Anne Pitt, "Internal Audit Quality", 2014)

"Named after Kaoru Ishikawa, a diagram that shows possible causes of effects that you want to study such as excessive bugs, delays, and other failures in the development process." (Rod Stephens, "Beginning Software Engineering", 2015)

"A diagramming technique, also called the Ishikawa diagramming, which teams can use to identify root causes to a problem, the effects of an action, or the action items they could take to meet a goal." (David K Pham, "From Business Strategy to Information Technology Roadmap", 2016)

"A decomposition technique that helps trace an undesirable effect back to its root cause." (Project Management Institute, "The Standard for Organizational Project Management (OPM)", 2018)

📈Graphical Representation: Sparklines (Definitions)

"A sparkline is a small, intense, simple, word-sized graphic with typographic resolution." (Edward R Tufte, "Beautiful Evidence", 2006)

"A sparkline is a mini-image (thumbnail) of graphical data that enables you to put numbers in a temporal context without the need to display full charts." (Brian Clifton, "Advanced Web Metrics with Google Analytics", 2010)

"A sparkline is a very small chart, typically a line chart, that is drawn without labels on either axis. Its purpose is to show the variation in data, typically over time." (Bruce Johnson, "Professional Visual Studio", 2012)

"Sparklines are condensed graphs or charts that can be used in-line with text or grouped to show trends across several different measures." (Ryan Sleeper, "Practical Tableau: 100 Tips, Tutorials, and Strategies from a Tableau Zen Master", 2018)

"A sparkline is a small line chart with the purpose of showing a general trend, without the full details. A sparkline is very efficient in the amount of screen space it uses." (Andrew Berridge &, Michael Phillips, "TIBCO Spotfire: A Comprehensive Primer", 2019)

"A Sparkline is a tiny chart that appears in a cell and does not include any text data. So, a Sparkline is a great way to give a quick glance of a trend." (Eric Butow, "MCA Microsoft Office Specialist: Office 365 and Office 2019", 2021)

"A sparkline chart is a small, simple, and condensed data visualization tool that presents trends and variations in data over a concise space, typically in the form of a tiny line chart." (Phocas)

"A sparkline is a line chart that displays the variation in a null value, unique value, or non-unique value across the latest five consecutive profile runs." (Informatica)

"A sparkline is a tiny chart in a worksheet cell that provides a visual representation of data. Use sparklines to show trends in a series of values, such as seasonal increases or decreases, economic cycles, or to highlight maximum and minimum values. Position a sparkline near its data for greatest impact." (Microsoft)

"A sparkline is a very small line chart, typically drawn without axes or coordinates. It presents the general shape of a variation (typically over time) in some measurement, such as temperature or stock market price, in a simple and highly condensed way." (Wikipedia)

"Sparklines are small, simple line graphs traditionally used for displaying trends or variations of some variable" (TIBCO)

09 August 2011

📈Graphical Representation: Mind Map (Definitions)

"A visual note-taking process that pares thoughts to key words and pictures illustrating the relationships among concepts." (Ruth C Clark & Chopeta Lyons, "Graphics for Learning", 2004)

"A mind map consists of a central concept which acts as a headline for the map and the branches that represent the aspects of the main concept. A mind map allows summarizing and decomposition of the key aspects of a complex problem or issue." (Hannu Kivijärvi et al, "A Support System for the Strategic Scenario Process", 2008) 

"A mind map is a diagram uses intuition to depict words, ideas or other items in branches around a central key word or idea." (Wan Ng & Ria Hanewald, "Concept Maps as a Tool for Promoting Online Collaborative Learning in Virtual Teams with Pre-Service Teachers", 2010)

[mind mapping:] "A process that brainstorms ideas, words, tasks or other elements and arranges them in groups around a central notion."  (Wan Ng & Ria Hanewald, "Concept Maps as a Tool for Promoting Online Collaborative Learning in Virtual Teams with Pre-Service Teachers", 2010)

[mind-mapping:] "A technique that uses multiple levels of detail for a texture. This technique selects from among the different sizes of an image available, or possibly combines the two nearest sized matches to produce the final fragments used for texturing." (Graham Sellers et al, "OpenGL SuperBible: Comprehensive Tutorial and Reference" 5th Ed., 2010)

"Refers to a technique for the graphical representation of information items, enabling visualization. A mindmap has a radial structure: it is constructed by starting from a central information item, around which other information items are organized like rays from a star, except that each ray can in turn be subdivided in a plurality of finer rays, and so on. The 'rays' are linear, going from an upstream point to downstream, more secondary points, and so on." (Humbert Lesca & Nicolas Lesca, "Weak Signals for Strategic Intelligence: Anticipation Tool for Managers", 2011)

"Powerful techniques you can utilize to increase your comprehension of written materials." (Jeffrey Magee, "The Managerial Leadership Bible", 2015)

[mind-mapping:] "A technique used to consolidate ideas created through individual brainstorming sessions into a single map to reflect commonality and differences in understanding and to generate new ideas." (Project Management Institute, "A Guide to the Project Management Body of Knowledge (PMBOK Guide)", 2017)

[mind mapping:] "A method to brainstorm thoughts while showing relationships of the parts to the whole." (Errick D Farmer et al, "Digital Course Redesign to Increase Student Engagement and Success", 2019)

"A diagram used to represent words, ideas, tasks, or other items linked to and arranged around a central keyword or idea. Mind maps are  used to generate, visualize, structure, and classify ideas, and as an aid in study, organization, problem solving, decision making, and writing." (Software Quality Assurance)

04 August 2011

🔏MS Office: Access vs. LightSwitch - About Starts and Ends of Software Products

Introduction

    When an important software product or technology is released on the market, it brings with it dooming prophecies about the end/death of a competing or related product or technology. Even if maybe it catches the attention, the approach became a stereotype leading to other futile fights between adepts, some food for thought and a pile of words for search engines. As LightSwitch was released recently, people started already sketching dooming plans for competing tools like MS Access, Silverlight, WebMatrix, Visual Studio, etc. It’s actually interesting to study and understand how the entry on the software market impacts the overall landscape, the publishing of more or less pertinent thoughts on the future of a product are more than welcome, though from this to forecasting the end of a software product or technology, at least not without well-grounded reasons, it’s a long way.
    In many cases it’s not even needed to go too deep into the features of the compared software products in order to dismiss such statements, this because there are a few common sense reasons for which the respective products will coexist, at least for the near future. Here are a few of them grouped into technology, products, people, partners and processes. Please note that by the terms old and new (software) products I’m referring here to a product existing on the market for a longer time, respectively a newly entered product.

Technology

    In theory a new software product attempts to take advantage of the latest technological advances in the field, following the trends. Also an old product can take advantage of the latest technological developments, though a certain backward compatibility needs to be maintained, fact that could come with advantages and disadvantages altogether. Considering that nowadays such a product doesn’t exist “per se” but in a complex infrastructure with multiple layers of interconnectivity, a new product has to fit also in the overall picture.
    A product in particular and a technology in general is doomed to extinction when it’s not more able to cope with the trends, when its characteristics don’t satisfy anymore users’ demands or the overhead of using it is greater than its benefits. As long two competing software products are trying to keep up with the trends and consolidate their market, the chances that they will parish are quite small. On the other side, each technology has sooner or later its own end.

Products

    Software products having a few years on the market have reached in theory a certain maturity and stability. New software products typically go through an adoption phase that may last from months to years, and it will take time until they reach a certain maturity and stability, until their market develops, until vendors include them in their portfolio, until other products will develop interfaces to them, etc. First of all it will take some time until the two will come to have the same market share, and secondly it will take even more time until the market share of one of the products will deprecate. In addition, markets embrace diversity and the demands are so various that each product arrives to find his place.
   When the products are coming from the same vendor and they are a part of greater packages and strategies, it’s hard to believe that a vendor would want to blow in the air his own business. Usually the two solutions target different markets, even if their markets intersect. Sure, there are also cases when a vendor might want to strengthen the position of a product in the detriment of another, especially when the benefits are higher.

People

    Often different products demand different skill sets or an upgrade of skill set. For sure not all developers will move from one platform to the other, some will be reticent, while others are declared fans so there is no way to move to something new. Sure, in IT there are frequent the cases when developers have knowledge about 2-3 competing products, though this aspect doesn’t necessarily have a huge impact on the short term. Considering that software products are becoming more and more complex, it’s sometimes even needed a specialization covering only a part of a product.

Partners

    Vendors and Customers, especially existing partners, will most probably approach and evaluate the new product, find a place in their portfolio/solution, conduct some pilot projects and eventually consider the product for further use. We can talk here about an adoption period, corroborated with the appearance of training material, best practices, books or any other material that facilitate the use of such a product. All this time requires time and effort, successful and unsuccessful projects, some years of experience.

Processes

    Organizations have already in place solutions based on a product and integrated with other products. Some of them could be personal solutions, and maybe quite easy to replace, though the replacement of business/enterprise solutions come maybe with important expenses, changes in the infrastructure, and maybe the most important, process changes. And why change something that’s working just for the sake of change?! Sure, if there is the need for a second or third product, this doesn’t (always) mean that all the previous similar products will be replaced. For sure the two or more products can coexist, even if provide similar functionality, and the can maybe complete each other.

Conclusion

    If one product or another will come to its end, for sure only time will tell. Usually when this happens, there are multiple factors that influenced the decay, factors that could be used maybe to foresee such an event. Though, without a detailed analysis or at least some well-supported ideas, dooming declarations about the rise or fall of software products are kind of futile, even if intended to catch readers’ attention. Enthusiastic or contradictory feelings about old or new products are natural, expressing opinions is free and welcomed when there is something to say, though are such declarations really necessary?!

📈Graphical Representation: Histogram (Definitions)

"A graph showing a variable's discrete values or ranges of values on the x-axis and counts or percentages on the y-axis. The number of observations for each value or range is presented as a vertical rectangle whose length is proportionate to the number of observations." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"A graphical display of the frequency distribution of a set of data. Histograms display the shape, dispersion, and central tendency of the distribution of a data set." (Clyde M Creveling, "Six Sigma for Technical Processes: An Overview for R Executives, Technical Leaders, and Engineering Managers", 2006)

"A chart that shows quantities of data points that occur within various numeric ranges." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A special form of bar chart used to describe the central tendency, dispersion, and shape of a statistical distribution." (For Dummies, "PMP Certification All-in-One For Dummies, 2nd Ed.", 2013)

[height-balanced histogram:] "A histogram in which column values are divided into buckets so that each bucket contains approximately the same number of rows." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[top frequency histogram:] "A variation of a frequency histogram that ignores nonpopular values that are statistically insignificant, thus producing a better histogram for highly popular values." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[hybrid histogram:] "An enhanced height-based histogram that stores the exact frequency of each endpoint in the sample, and ensures that a value is never stored in multiple buckets." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[frequency histogram:] "A type of histogram in which each distinct column value corresponds to a single bucket. An analogy is sorting coins: all pennies go in bucket 1, all nickels go in bucket 2, and so on." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

"A chart that shows a frequency distribution." (E C Nelson & Stephen L Nelson, "Excel Data Analysis For Dummies ", 2015)

"A bar chart that shows the graphical representation of numerical data." (Project Management Institute, "A Guide to the Project Management Body of Knowledge (PMBOK Guide )", 2017)

16 July 2011

💠🛠️SQL Server: Administration [Could not load file or assembly Microsoft.MSXML]

I’m not programming anymore as I used to do, though from time to time I still fancy some .Net programming. It’s not much, small applications or CLR-based libraries for SQL Server. Quite often, when I return to programming after a long pause it happens that I run into problems, finding that something that was working previously stopped working. During my last attempt I couldn’t load anymore one of the projects I worked on, receiving the following error:

“Could not load file or assembly ‘Microsoft.MSXML, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’ or one of its dependencies. The system cannot find the file specified.”

Same happened when I tried to load other projects. I looked then into GAC in “C:\Windows\assembly” folder and saw no reference to Microsoft.MSXML dll. So I tried to install the msxml6.dll assembly in GAC however, never doing that, I run into another problem. In the meantime I tried to install the Visual Studio 2010 SP1, the MSXML 6.0 and even the Windows .Net SDK. All this without success. After several good hours, I returned to one of the forum posts (here) I run into in a first place. Thomas Sun was pointing that it might be a problem with the Microsoft Document Explorer 20xx. The solution was to reinstall it from  “C:\Program Files\Common Files\microsoft shared\Help 9\Microsoft Document Explorer 2008”. Once I did that everything was back to normal. At least until I’ll run into another issue.

After all this there is still one positive point: I managed to install SP1 and all the goodies it comes with, and I’m thinking here at the support for HMTL5. The downside – several good hours of lost time! I don’t want to think how much time I lost until now trying to solve things that were supposed to work in a first place – probably weeks, months…  That’s part of programmers’ life.

Disclaimer:
As Microsoft changed the whole structure of their support websites, most of the resources become unavailable. Therefore I had to remove the links pointing to the various sources.


18 June 2011

💎SQL Reloaded: Pulling the Strings of SQL Server X (Dynamic Queries)

A dynamic query is a query constructed at runtime, techniques often indispensable in many situations that require a certain flexibility in query’s creation. The creation of a dynamic query is nothing but a set of operations with strings, many of the techniques mentioned before becoming handy. SQL Server provides two functions for the execution of dynamic queries, namely EXECUTE statements (or its shortened form EXEC) and sp_executesql stored procedure. Even if the later it’s more flexible allowing passing parameters from and to the caller and allows reusing executions plans (see Using sp_esecutesql), for the following examples will be used only EXEC. But before let’s look how a static could become dynamic. For this let’s consider the following query based on AdventureWorks database:
 
-- example 1 - simple query   
SELECT *  
FROM Person.Address  
WHERE AddressID = 1 

-- example 2 - query encapsulated in a string: 
EXEC ('SELECT * FROM Person.Address WHERE AddressID = 1') 

-- example 3 - query stored into a string variable      
EXEC ('SELECT * FROM Person.Address WHERE AddressID = 1') 

Supposing that the AddressID is considered as parameter we can write:

-- example 4 - static query     
DECLARE @AddressID int 
SET @AddressID = 1 
SELECT *  
FROM Person.Address  
WHERE AddressID = @AddressID  

-- example 5 - dynamic query  
DECLARE @sql varchar(100) 
DECLARE @AddressID int 
SET @AddressID = 1 
SET @sql = 'SELECT * FROM Person.Address WHERE AddressID = ' + CAST(@AddressID as varchar (10)) 
EXEC (@sql) 

Until here there is no important conceptual difference. What if is needed to pass multiple AddressIDs? We can create a parameter for which expected values, though that’s not a reasonable solution as the number of values can vary. A more elegant solution would be to create a list of values and provided as a string parameter and then concatenate the original query and the string parameter like below. We just need to accommodate the length of the string variable to the expected size of the list of value.
 
-- example 6 (dynamic query) 
DECLARE @sql varchar(100) 
DECLARE @AddressIDs varchar(50) -- supposed parameter 
SET @AddressIDs = '1, 2, 4, 5, 6, 10'  
SET @sql = 'SELECT * FROM Person.Address WHERE AddressID IN (' + @AddressIDs + ')' 
EXEC (@sql) 

There is actually a third solution. As in the previous post on list of values has been introduced the dbo.StringToTable function, the function can be used thus to transform the list in a table:
 
-- example 7 (list of values) 
DECLARE @AddressIDs varchar(50) -- supposed parameter 
SET @AddressIDs = '1,2,4,5,6,10'  
SELECT *  
FROM Person.Address  
WHERE AddressID IN ( 
      SELECT value  
      FROM dbo.StringToTable(@AddressIDs, ',')) 

In the same post was constructed the DoubleList list of values which can be used in a dynamic query in bulk inserts or table-value constructors. The list needs to be slightly modified by replacing the single quote with two single quotes in order to accommodate value’s storage in a string. Considering that there are no integrity constraints on the targeted table, he query for bulk insert can be written as follows:
 
-- example 8 (list of values & bulk insert) 
DECLARE @sql varchar(200) 
DECLARE @AddressTypes varchar(150)  
SET @AddressTypes = '(6,''Archive''), (1,''Billing''), (2,''Home''), (3,''Main Office''), (4,''Primary''), (5,''Shipping'')' 
SET @sql = 'INSERT Person.AddressType (AddressTypeID, Name) VALUES ' + @AddressTypes  
EXEC (@sql) 

  The same technique can be used with a table-value constructor:
 
-- example 9 (list of values & table-value constructor) 
DECLARE @sql varchar(400) 
DECLARE @AddressTypes varchar(150)  
SET @AddressTypes = '(6,''Archive''), (1,''Billing''), (2,''Home''), (3,''Main Office''), (4,''Primary''), (5,''Shipping'')' 
SET @sql = 'SELECT VA.AddressID, VA.AddressTypeID, AT.NAME FROM Purchasing.VendorAddress VA JOIN ( VALUES ' + @AddressTypes + ') AS AT(AddressTypeID, Name) ON VA.AddressTypeID = AT.AddressTypeID' 
EXEC (@sql) 

The above examples are basic, in daily problems such queries can involve multiple parameters and operations. In addition, in the last examples the concatenation step was left out.

 
Happy coding!

17 June 2011

💎SQL Reloaded: Pulling the Strings of SQL Server VII (List of Values)

Introduction

    Lists are one of the basic structures in Mathematics, the term referring to an (ordered) set of elements separated by comma, space or any other delimiter (e.g. “:”, “;”). The elements of a list can be numbers, words, functions, or any other type of objects. In the world of databases, a list is typically formed out of the values of a given column or a given record, however it could span also a combination of rows and records, is such cases two delimiters being needed – one for column and one for row. From here comes probably the denomination of list of values. In a more general accept a list of values could be regarded as a delimited/concatenated subset. Such lists are formed when needed to send the data between the layers of an application or applications, this type of encoding being quite natural. In fact, also the data in a database are stored in similar tabular delimited structure, more complex though.  

    An useful example in which the list of values are quite handy is the passing of multiple values within the parameter of stored procedure or function (see example). This supposes first building the list and then use the values in a dynamic build query (like in the before mentioned example) or by building a table on the fly. We can call the two operations composition, respectively decomposition of list of values.

Composition 

Composition, whether on vertical or horizontal is nothing but a concatenation in which the values alternate with one or more delimiters. Let’s reconsider the concatenation based on the values of a Person.AddressType AdventureWorks table. As the logic for concatenating for one or more attributes is the same, the below example concatenates a list based on a single attribute, namely AddressTypeID in SingleList, respectively two attributes, AddressTypeID and Name.

-- concatenation of values across a table 
;WITH CTE (AddressTypeID, Name, Ranking) 
AS (--preparing the data       
     SELECT AddressTypeID  
     , Name 
     , ROW_NUMBER () OVER(ORDER BY Name) Ranking 
     FROM Person.AddressType 
     -- WHERE ... 
) 
, DAT (SingleList, DoubleList, Ranking) 
AS ( -- concatenating the values 
     SELECT Cast(AddressTypeID as varchar(max)) SingleList 
     , Cast('('+ Cast(AddressTypeID as varchar(10)) + ',''' + Name + ''')' as varchar(max)) DoubleList 
     , Ranking 
     FROM CTE 
     WHERE Ranking = 1 
     UNION ALL 
     SELECT DAT.SingleList + ',' + Cast(CTE.AddressTypeID as varchar(20)) SingleList 
    , Cast(DAT.DoubleList + ', ('+ Cast(CTE.AddressTypeID as varchar(10)) + ',''' + CTE.Name + ''')' as varchar(max)) DoubleList 
    , CTE.Ranking  
     FROM CTE          
       JOIN DAT           
          ON CTE.Ranking = DAT.Ranking + 1       
)  

-- the lists 
SELECT SingleList 
, DoubleList 
FROM DAT 
WHERE Ranking = (SELECT MAX(Ranking) FROM DAT) 

 List of values - concatenation
 

   The second example is based on atypical delimiters, resembling to the structure built for a batch insert or table value constructor-based statement, and as we’ll see later, ideal to be used in a dynamically-built query

Decomposition

Decomposition follows the inverse path, though it’s much easier to exemplify. In fact it’s used the same technique introduced in the last example from the previous post belonging to the same cycle, Subparts of a String, in which a space was used as delimiter. Another example is the dbo.SplitList function which decomposes a string using a loop.


-- decomposition of a string to a table using CTE 
CREATE FUNCTION dbo.StringToTable( 
 @str varchar(500) 
,@Delimiter char(1)) 
RETURNS @Temp TABLE ( 
Id int NOT NULL 
,Value varchar(50)) 
AS 
BEGIN  
     ;WITH CTE (PrevString, Position, Word)  
     AS (  
     SELECT LTrim(RTrim( CASE  
           WHEN CharIndex(@Delimiter, @str)>;0 THEN Right(@str, Len(@str)-CharIndex(@Delimiter, @str))  
           ELSE ''  
      END)) PrevString  
     , 1 Position  
     , LTrim(RTrim(CASE  
           WHEN CharIndex(@Delimiter, @str)>0 THEN LEFT(@str, CharIndex(@Delimiter, @str)-1)  
           ELSE @str  
       END)) Word  
      UNION ALL  
      SELECT LTrim(RTrim(CASE  
            WHEN CharIndex(@Delimiter, PrevString)>0 THEN Right(PrevString, Len(PrevString)-CharIndex(@Delimiter, PrevString))  
             ELSE ''  
       END)) PrevString  
      , Position + 1 Position  
      , LTrim(RTrim(CASE  
           WHEN CharIndex(@Delimiter, PrevString)>0 THEN LEFT(PrevString, CharIndex(@Delimiter, PrevString)-1)  
          ELSE PrevString  
      END)) Word      FROM CTE  
     WHERE Len(PrevString)>0  
    )  
     INSERT @Temp(Id, Value) 
     SELECT Position  
     , Word      FROM CTE  
     OPTION (maxrecursion 100)  
     RETURN 
END    

Here are two examples based on the single list created above and another one based on alphabet:


-- decomposing a list
SELECT Id 
, value 
FROM dbo.StringToTable('6,1,2,3,4,5', ',')     


-- decomposing the "alphabet" 
SELECT Id 
, value 
FROM dbo.StringToTable('a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z', ',') 

List of values - simple decomposition    

List of values - alphabet decomposition
   
Even if the function deals only with a delimiter, it could be used to decompose lists involving multiple delimiters, as long the list is adequately built:


-- decomposing double list 
SELECT Id 
, value 
, Left(value, CHARINDEX(',', value)-1) LeftValue 
, Right(value, len(value)-CHARINDEX(',', value)) RightValue 
FROM dbo.StringToTable('6,Archive;1,Billing;2,Home;3,Main Office;4,Primary;5,Shipping', ';')     

List of values - double decomposition 

 The tables built thus from list of values can be further used in queries when needed to create a table on the fly. It would be interesting maybe to show that the composition and decomposition are inverse functions, however that’s out of scope, at least for current set of posts. 

13 June 2011

🔹SQL Server: Procedure Cache (Definitions)

"An area of SQL Server memory used to hold trees and query plans that have recently been used or are currently in use by SQL Server." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"A temporary storage location for the current, executing version of a specific stored procedure." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The part of the SQL Server memory pool that is used to store execution plans for Transact-SQL batches, stored procedures, and triggers." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A shared, reserved memory area that stores the most recently executed SQL statements or PL/SQL procedures (including triggers and functions). Also called SQL cache." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"The part of the SQL Server memory pool that is used to store execution plans for Transact-SQL batches, stored procedures, and triggers." (Microsoft, "SQL Server 2012 Glossary", 2012)

30 April 2011

🔹SQL Server: Full-Text Index (Definitions)

"The portion of a full-text catalog that stores all of the full-text words and their locations for a given table." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A special index that efficiently tracks the words you're looking for in a table. They help in enabling special searching functions that differ from those used in regular indexes." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"This is an index that can be queried for large amounts of text. This is maintained by the full-text search service." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A full-text index enables advanced text-based searches to be performed against a database table." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"Index that enables advanced text-based searches to be performed against a database table." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide" 4th Ed., 2015)

23 April 2011

🪄SSRS: First Magic Class (Wizarding a Report)

Introduction

    One year back I started a set of posts on SSIS showing how to create a package with ‘SQL Server Import and Export Wizard’ using as source a SQL Server, respectively Oracle database, in a third post showing how to use the Data Flow Task. The “wizarding” thematic was based on the fact the respective posts were showing how to make use of the wizards built within SQL Server and Visual Studio in order to create a basic export package. Therefore the posts were targeting mainly beginners, my intention at that time was to use them as a starting point in showing various objects and techniques. I also intended to start a set of similar posts on SSRS, so here I am, in this first post showing how to use the Report Wizard to create a report in BIDS (SQL Server Business Intelligence Development Studio) using a SQL Server database.

    This short tutorial uses a query based on AdventureWorks2008 sample database and considers that your Reporting Services instance is installed on the same computer with the database. For simplicity, I focused on the minimal information required in order to built a simple report, following to cover some of themes in detail in other posts. The tutorial can be used together with the information provided in MSDN, see Creating a Report Using Report Wizard and Report Layout How-to Topics sections.

Step 1: Create the Query 
 
    Before creating a report of any type or platform, it’s recommended to create and stabilize the query on which the report is based. For simplification let’s use a query based on Sales.vIndividualCustomer view:

-- Customer Addresses 
SELECT SIC.Title  
, SIC.FirstName  
, SIC.LastName  
, SIC.AddressLine1  
, SIC.City  
, SIC.PostalCode  
, SIC.CountryRegionName  
, SIC.PhoneNumber  
, SIC.EmailAddress  
FROM Sales.vIndividualCustomer SIC 

Step 2: Create the Project 
 
   Launch the BIDS from Windows Menu, create a new Project (File/New/Project) by using the “Report Server Project Wizard” Template, give the Project an appropriate Name and other related information. 
 
SSRS Tutorial New Project
   This will open the Report Wizard, and unless you have chosen previously not to show this step, it will appear a tab in which are shown the step that will be performed:
- Select a data source from which to retrieve data
- Design a query to execute against the data source
- Choose the type of report you want to create
- Specify the basic layout of the report
- Specify the formatting for the report
- Select the report type, choose tabular

Step 3: Create/select the Data Source 
 
    Creating a Data Source pointing to the local server isn’t complicated at all, all you have to do is to give your data source a meaningful name (e.g. Adventure Works) and then define connection’s properties by selecting the “Server name” and database’s name, in this case AdventureWorks2008.

SSRS Tutorial Connection Properties 
    Test the connection, just to be sure that everything works. In the end your data source might look like this:
SSRS Tutorial Select Data Source
    You can define your data source as shared by clicking the “Make this a shared data source” checkbox, allowing you thus to reuse the respective data source between several projects.

Step 4: Provide the Query 
 
   Because the query for our report was created beforehand, is enough to copy it in “Query string” textbox. We could have used the “Query Builder” to built the query, though a tool like SSMS (SQL Server Management Studio) can be considered a better choice for query design. As pointed above, it’s recommended to built the query and stabilize its logic before starting the work on the actual report. SSRS Tutorial Design the Query
Step 5: Generate the Report
   
Creating a report supposes choosing a Report Type (Tabular vs. Matrix), the level at which the fields will be displayed (page, group or details), and the Table Style. For this tutorial choose a Tabular type and, as no grouping is needed, in “Design the Table” step choose all the Available fields and drag-and-drop them in “Details”, the bottommost list from “Displayed fields” section. 
 
SSRS Tutorial Design the Table
    In Choose the Table Style go with the first option (e.g. Slate), though it’s up to you which one of the styles you prefer.

Step 6: Deploy the Report 
 
    You can deploy the report if your report server is already configured, however in order to test the report you don’t have to go that far because you can test the report directly in BIDS. So you can go with the actual settings: 
 
SSRS Tutorial Choose the Deployment Location
    In the last step provide a meaningful name for the Report (e.g. Customer Addresses) and here is the report in Design mode: 
 
SSRS Tutorial Design
   By clicking on “Preview” tab you can see how the actual report will look like:
SSRS Tutorial Preview 
   Now I would recommend you to check what objects the wizard has created and what properties are used. Just play with the layout, observe the behavior and what the documentation says. There are plenty of tutorials on the web, so don't be afraid to search further.

09 April 2011

💎SQL Reloaded: Searching a Value within a Database

Some years back, in a Management meeting related to one of the projects I was working on, one of the Managers made a quite interesting statement, something like: "it should be easy to pool in an Excel sheet from the whole database all the information I am interested in". I kind of smiled, because the information he was requiring was spanning within different modules, having different levels of detail and thus requiring several different output datasets in order to answer a simple question like "how is the manufacturing cost varying over time?". After some meetings we figured out that it’s not so easy to pull the data for the respective question, lacking some data which were not available in the respective database. It’s funny how IS (Information Systems) are getting more complex day by day but fail to capture and indirectly answer simple questions, this also because they were not designed to address such questions. Anyway, this post is not about this.

Some days back I spent a couple of hours trying to identify the tables from which the IDs stored in a log where coming from, in the log being saved no reference to the table, just the IDs. Another example of "good" design! While working diligently, I remembered that I could have easily solved the problem if I were using one of the CLR functions created in a post on Data Profiling Using CLR Functions. Unfortunately, because I was working in a production environment with strong security constraints, I wasn’t allowed registering and use any of the respective functions. Still, considering the easiness and flexibility of a the solution I thought that would be interesting to put it into practice. Who knows, maybe someday I will find the opportunity to use it!

Let’s suppose I am searching for an integer ID value within a given schema of AdventureWorks database. So, in theory, I need to do a search on each integer column of all the tables from the respective schema. The tables can be found in a matter of seconds using the following query based on the dbo.ExecuteScalarToInt CLR function introduced in the previous mention post and dbo.vTableColumns view.
 
-- searching a value in whole database 
WITH CTE 
AS (  
    SELECT [Schema_Name]  
    , Table_Name  
    , Column_Name 
    , system_type 
    , dbo.ExecuteScalarToInt('SELECT count(*) FROM ' + Quotename([Schema_Name]) + '.' + Quotename(Table_Name) + ' WHERE ' + Quotename(Column_Name) + ' = ' + Cast(1001 as varchar(20))) NumberRecords 
    FROM dbo.vTableColumns 
    WHERE [Schema_Name]= 'Sales' 
     --AND Table_Name = 'SalesOrderDetail' 
     --AND Column_Name = 'SalesOrderDetailID' 
     AND system_type = 'int' ) 
SELECT * 
FROM CTE 
WHERE NumberRecords>0 
ORDER BY [Schema_Name] 
, Table_Name  

Output:
Schema_Name Table_Name Column_Name system_type NumberRecords
Sales ContactCeditCard CreditCardId int 1
Sales CreditCard CreditCardId int 1
Sales CurrencyRate CurrencyRateId int 1
Sales CustomerAddress AddressId int 1
Sales SalesOrderDetail SalesOrderDetailId int 1
Sales SalesOrderHeader BillToAddressId int 8
Sales SalesOrderHeader CreditCardId int 1
Sales SalesOrderHeader ShipToAddressId int 8

As can be seen the dynamically built query, based on table metadata and searched value, is provided as parameter to dbo.ExecuteScalarToInt function. The query calculates the number of occurrences (see NumberRecords column) of the same value within a given column. Some attention must be given to the search column’s data type, as the data type needs to be compatible with the searched value, otherwise a cast being necessary. I haven’t tried the query in a big database and I would advice you to be cautious when running it in a production environment. In case you arrive to use the query in a big database, please let me know how it behaved!

In case of the problem I had a few days back, the above query it’s only a first step in identifying the tables in which the values occur. Because multiple tables are returned, it’s needed to investigate which of the tables has the highest probability to be the source for the ID saved in log. Of course, some DM (data mining) algorithm could be probably used for this purpose, however some programming is needed and also some amount of diligent work to prove the data. It would be nice to have a tool that performs such analysis or that creates a complex query using any set of attributes within a given database.
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.