Showing posts with label MySQL. Show all posts
Showing posts with label MySQL. Show all posts

06 November 2016

SQL Server Administration: System.OutOfMemoryException in SQL Server Management Studio and other 32-bit Drawbacks

    I was playing this week with a few datasets downloaded from the web on various topics, trying to torture the data until they’ll confess something. A few of the datasets were prepared for load into a MySQL database as individual INSERT INTO statements. They were containing between 100000 and a few millions of records. While looking at the big but slim datasets in SSMS (SQL Server Management Studio) and reconciling the differences between MySQL and SQL Server I got several times the System.OutOfMemoryException exception, SSMS crashing one or two times. That should be ok, given the number of records, though I was surprised that I got the same error message while executing the INSERT INTO statements for one of the smallest datasets which had about 300000 records:

    „An error occurred while executing batch. Error message is: Exception of type 'System.OutOfMemoryException' was thrown”

    Kb 2874903 brings some light into the topic – SSMS is still a 32-bit process and thus limited to 2GB of memory. The Kb offers three methods to avoid this issue. The first two, outputting the query results to text or file didn’t worked. The third method based on using sqlcmd utility worked smoothly with a syntax like the one below:
sqlcmd -i “<file_name.sql>” -d “<database name>”

    So it doesn’t matter that you’re having a supercomputer and that working with big datasets becomes a necessity nowadays, this limitation can make data loading just a little bit more complicated. On one side, it’s true that when dealing with such datasets is probably recommended to use directly sqlcmd to execute the scripts. On the other side, independently from this type of problem, even if understandable from the need of keeping backwards compatibility with 32-bit platforms/solutions, it’s hard to digest the fact that Microsoft keeps some of its products 32-bit based when SQL Server is targeting 64-bit platforms. One has same problem when using BIDS (Business Intelligence Development Studio), developing SSRS, SSIS or SSAS solutions under 32-bit and having maybe to deploy the code as 64-bit (e.g. SQL Server Agent). From my point of view most of the issues I had were when dealing with proprietary drivers like the ones for Oracle or even for MS Office. In addition in SSIS there could be features that are only available in 32-bit versions, or have limitations on 64-bit computers (see [5]). As it seems also the SQL Server Data Tools (SSDT) will have similar drawbacks…

   Anyway, sqlcmd utility saved the day with a minimum of overhead. Unfortunately it’s not always that easy to solvethe compatibility issues between 32-bit and 64-bit software and platforms.

Update 20.06.2017:
   One can synchronize the runtime version between BIDS and SQL Server Agent pretty easy. In BIDS under "Configuration Properties/Debug Option" at Project level, there is the “Run64BitRuntime” Property. Set to false it will run your package on 32-bit version. In a SQL Server Agent Package, there is the “Use 32 bit runtime” Checkbox under “Execution options” at step level. Checking this checkbox will run your package on 32-bit version.

   A hint that the two values might be out of synch is the following error message raised when running the package:
"Attempt to load Oracle client libraries threw BadImageFormatException. This problem will occur when running in 64 bit mode with the 32 bit Oracle client components installed."
Resources:
[1] Microsoft Support (2013) Kb 2874903: "System.OutOfMemoryException" exception when you execute a query in SQL Server Management Studio https://support.microsoft.com/en-us/kb/2874903
[2] MSDN (2016) SQL Server 2016: sqlcmd Utility https://msdn.microsoft.com/en-us/library/ms162773.aspx
[3] MSDN (2016) SQL Server 2016: Use the sqlcmd Utility https://msdn.microsoft.com/en-us/library/ms180944.aspx
[4] MSDN (2012) Introducing Business Intelligence Development Studio  https://msdn.microsoft.com/en-us/library/ms173767.aspx
[5] SQL Server 2008 R2: 64 bit Considerations for Integration Services https://technet.microsoft.com/en-us/library/ms141766(v=sql.105).aspx





01 January 2011

Book Review: Lynn Beighley's "Head First SQL" (2011)

I love the Head First O’Reilly book series which besides the book on SQL, has acquired over its short existence several other titles like the ones on Data Analysis, Statistics and Excel, topics strongly correlated with the thematic of the current blog, but also related tangential topics like Project Management, Design Patterns, Object-Oriented Analysis and Design, Software Development, Web Design, Programming, PHP & MySQL, HTML with CSS & XHTML, C#, Java, AJAX, JavaScript, Pyhton, Servlets and JSP, EJB or Rails. (The book series contains more titles which seems to be available also at Head First Labs together with other supporting information.) 

What I like about this book series is that approaches the themes from a basic level, using a rich set of "brain-friendly" visual representations, conversational and personalized style, challenging food for thought, plus some advices on how to use book’s format, facts that allow the reader to make most of the learning process. In fact the books are approachable for any newbies in IT branch, so I recommend the whole set of books to beginners as well to experienced programmers, considering the fact that everybody has something to learn, at least when considering the presentational format of the book.

Coming back to the current title, Head First SQL, it covers almost everything somebody has to know about database design, DML and DDL statements, database objects and security in a nutshell. It focuses only on the SQL Standard (I wish I know which one as there are several versions/revisions, see SQL 1999), so it could be used for all type of RDBMS that implements the standard, even if the book is built around MySQL. A consequence of this aspect is that any vendor specific flavors won’t be found in here, so the book should be used in combination with the documentation or the books that target the vendor-specific functionality. There are also several topics not covered but mentioned roughly: reserved words and special characters, temporary tables, data conversion, some numeric functions and operators, indexes, etc.

I must remark that some of the definitions are somehow ambiguous, and I feel that in some cases additional clarifications are necessary. Some examples in this direction could be considered the very first important definitions, the ones of the database seen as "a container that holds tables and SQL structures related to those tables", the one of column seen as piece of data stored by a table, or of a row seen as a single set of columns that describe attributes of a single thing. The first definition is quite acceptable if it’s ignored the existence of non-relational databases (it’s also true that the book targets RDBMS even if not evident), however the other two definitions are too vague, at least from my point of view. 

It’s also true that more rigor and detail would involve the introduction of other concepts (e.g. entity, matrix, etc.), requiring more space and maybe complicate the topics. The reader could intuit from explanatory text and examples what’s all about, however expecting the reader to bridge such gaps should be avoided especially when we consider the intrinsic nature of this book in particular, and IT books in general. Anyway, this shouldn’t stop you discover and enjoy the book!

Notes:
1. As I learned on the hard way, not all book formats and writing styles are matching everybody’s taste and expectations. The simplistic and rich visually format of the book could annoy as well, so might be a good idea to browse first the book on Scribd before buying it.
2. The source code is available here.

15 August 2010

Data Security: SQL Injection I - Introduction

Introduction

  If you are working in IT, most probably you’ve heard already about SQL Injection, if not then might be it’s a good idea to ask your colleagues and eventually your IT manager if your company has any policies related to it. If you are working for a software vendor or a consultancy company then SQL Injection countermeasure techniques might be quite well positioned in the list of best practices in what concerns the development of Web/Desktop Applications, Web Services or database-related logic adopted by your company. If you are working for a company, other than the two mentioned above, and have various software projects on the role or already in house, then most probably you’ll have to ask if the software vendors you are working with have took into consideration the SQL Injection threats and proved their solutions against them. On contrary, if you have nothing to do with IT at all, it might still be a good idea to ask your IT department if they have anything in place related to SQL Injection – Security Policy, security best practices, etc.

Definition

  Wikipedia defines SQL Injection as “a code injection technique that exploits a security vulnerability occurring in the database layer of an application” [3], the code injection being defined as “the exploitation of a computer bug that is caused by processing invalid data” [4]. For a programmer the definition is acceptable, though for other type of professionals it might not be so clear what’s about, especially when they are not familiar with IT terminology. I find more clear the definition provided by J. Clarke et. al, who in his book SQL Injection Attacks and Defense, defines SQL Injection as the vulnerability that results when you give an attacker the ability to influence the Structured Query Language (SQL) queries that an application passes to a back-end database” [2]. I will slightly modify the last definition and say that the SQL injection is a security vulnerability residing in the possibility to alter the intended behavior of the SQL Queries passed to the database.

Some Background

    At the beginning of our century, with the increase importance of Web Applications whose availability over WAN/Internet (networks) brought new security issues, the SQL Injection became a really hot topic given the damages such techniques could do to an application, with just a few tricks the “hacker” having the possibility to enter in the application and even in the machine hosting the database used, entering thus in the possession of sensitive information, and above all having the possibility of damaging the database. J. Clarke et. al remarks that the first connection between web applications and SQL injection is widely accredited to Rain Forest Puppy, who in an article titled “NT Web Technology Vulnerabilities” (see “ODBC and MS SQL server 6.5” section) written in 1998 for Phrack, an e-zine written by and for hackers [2], was describing the behavior specific to SQL Injection in relation to MS SQL Server 6.5. 
 
    I remember when my boss break us the news that we have to protect urgently our applications against SQL Injection, having to redesign some of the database objects and components in order to protect our applications against such techniques. I was then in my first or second year of professional experience, so the topic was new and quite intriguing not only for myself but also for my colleagues, some of them having a few more years of professional programming experience that I did and, I hope I’m not mistaking, none (or few) of them actually have heard about it. It was interesting to check how simple techniques could do so much damage. At those times there were few articles on SQL Injection and specific countermeasure techniques, not to mention best practices, so we were kind of groping in the dark in finding a countermeasure to the problem.

State of Art

    Since then, the number of search engines hits on the topic is quite impressive, many professionals approaching the problem in their way, Vendors started to design their solutions and make aware programmers on best practices in order to minimize this type of security threat, books were written on this topic, the awareness increased between developers and other type of IT professionals. Even if considerable effort has been made into this direction, and the topic appears often on the blogs, there are still many web sites not designed to address SQL Injection concerns. In 2007, The WhiteHat Security, placed SQL Injection on 5th position in top of vulnerabilities, estimating that 1 out of 5 web sites is vulnerable to SQL Injection [1]. In 8th Web Security Report based on 2009 data provided by WhiteHat Security[5], and as it seems also in 9th report [6], SQL Injection remains on the same position, what’s interesting to remark is the split per scripting technology provided in [6]: 
 
SQL Injection - Statistics WhiteHat

   In Web Hacking Incident Database maintained by Web Application Security Consortium, SQL Injection is considered as 17.97 % out of the total 512 reported top attack methods. Even if the number of reported attacks is insignificant in report to the number of sites available on the web, the percentage of cases seems to be in agreement with the number provided in WhiteHat Security reports.

Resources

   If the topic made you curious, you could find out more with just a simple search on the Web. There are many professionals who wrote on this topic, however it’s a good idea to start directly with the resources provided by the RDBMS vendors, for example Microsoft through its Security Research & Defense blog, in SQL Injection Attacks post has an interesting list of resources on this topic. A nice document on ‘How to write SQL injection proof PL/SQL’ comes from Oracle, an interesting presentation on ‘SQL Injection Myths and Fallacies’ was made at MySQL Conference & Expo, etc.

References:
[1] WhiteHat Security. [2007]. Website Security Statistics Report. [Online] Available from: http://www.whitehatsec.com/home/assets/WPStatsreport_100107.pdf (Accessed: 15 August 2010)
[2] J.Clarke et. al (2009). SQL Injection Attacks and Defense. Elsevier. ISBN: 978-1-59749-424-3
[3] Wikipedia. (2010). SQL Injection. [Online] Available from: http://en.wikipedia.org/wiki/Sql_injection (Accessed: 15 August 2010)
[4] Wikipedia. (2010). Code Injection. [Online] Available from: http://en.wikipedia.org/wiki/Code_injection (Accessed: 15 August 2010)
[5] WhiteHat Security. [2009]. Website Security Statistic Report, 8th Ed. [Online] Available from: http://www.whitehatsec.com/home/assets/WPstats_fall09_8th.pdf (Accessed: 15 August 2010)
[6] WhiteHat Security. [2010]. Website Security Statistic Report, 9th Ed. [Online] Available from: http://www.slideshare.net/jeremiahgrossman/whitehat-security-9th-website-security-statistics-report-3995771 (Accessed: 15 August 2010)

17 April 2010

Software Engineering: SQL Reloaded (Part I: Learning SQL)

Software Engineering Series


Introduction

One of the topics that appears in many database related groups or forums is from where and how new comers should start to learn SQL (Structured Query Language) in general, and specific vendor SQL features in particular. Two years back, in Learning something new post from my programming-grasps blog attempt, I wrote down my thoughts on how to approach learning new subjects in general, many of the respective observations could be applied to SQL learning too, and in addition could be added also domain-specific recommendations. 

I would say that when discussing a topic like learning it makes sense to approach the subject from users level of knowledge and skill-set, different recommendations could apply for beginners, intermediate and expert level. No, also when you are an expert the learning process doesn’t stops there as SQL and RDBMS (Relational Database Management System) evolve with time, new features and techniques being added, same as new contexts of applicability could emerge (e.g. very large databases, geographic information systems, Web-related SQL, etc.).

The Beginner Level

Approaching a topic from beginner’s perspective is never an easy task given the fact that many new comers don’t know from where to start, what to learn first, what RDBMS to target. I was and I’m still at that point as new subjects appear on my learning schedule. One important decision is what RDBMS solution to choose, and for this you might need to consider several aspects: market share, goals/objectives, functional area (e.g. database administration, database development, data warehousing, business intelligence, ETL, data quality), personal preferences, salaries and jobs postings, UI, features, tools, learning curve, performance, programmability, extensibility, integration and compatibilities with other technologies, availability of documentation, books, tutorials and other learning material, etc.

I hope I’m not annoying anybody when I’m saying that Oracle, Microsoft and IBM are the leaders in RDBMS and related data technologies, this being also the ranking in what concerns the market share and job postings (see for example B. Nicolich’s SQL Server Job Trends article). If you aren’t having any preferences I would recommend you to focus on at least one of the RDBMS of the mentioned vendors: Oracle, SQL Server and MySQL. Unfortunately, given researches’ subjectivity, I can’t delimitate which of the three has better performance or is better to use, being important the context in which the studies are made, the overall architecture and their particularities, etc. 

Having a Microsoft technologies background I like SQL Server’s easy to use UI and integrated solution approach it provides, on the other hand Oracle seems to provide more SQL-related features (e.g. functions), while MySQL is still trying to catch the other two vendors and become more mature. All three vendors implement SQL ANSI 92 standard, so learning SQL on one of the platforms would allow you to write queries also on the other platforms, the difference residing in the functions used, DML (Data Manipulation Language) or DDL (Data Definition Language) syntax and specific functionality, architecture and tools.

   For SQL Server and Oracle you could download the trial versions, or opt for the Express versions that could be used free but with certain constraints, while for MySQL you could download one of the free community editions, quite stable solutions. If SQL Server comes with a nice Management Studio that could be used to write queries and administrate your databases, for Oracle and MySQL you might have to consider tools like Toad for OracleOracle or SQL Developer, respectively Toad for MySQL, that allow to be more productive on the respective platforms. If you are intending to use a RDBMS more for personal use (e.g. data analysis), what I call a personal database, then MS Access and MS Excel could prove to be enough, especially when is needed to develop personal or small scale solutions in an organization with a limited knowledge on SQL, though a more evolved RDBMS (actually MS Excel is not a RDBMS but it could be used as a data source or data analysis tool) is recommended when providing solution for multiple users.

All three vendors provide rich documentation, with a plus for SQL Server and Oracle, through the online MSDN or Technet, respectively Oracle Database Documentation Library, and MySQL Reference Manual, the knowledge provided by the documentation being complemented by the many published books, magazines, tutorials, blogs, webcasts, podcasts, studies, best practices and other type of technical literature, many of them easily retrievable with just a simple search using your favorite search engine, important being to know what you are looking for. You have to consider that the technical documentation focuses more on the features, syntax and technical details, while for specific purposes like development, architecture integration or problem solving you might have to consider the other sources of knowledge with a plus for books, tutorials and blogs, the later two being actually more up-to-date than the books, given their shorter “time to market” and informal character.  

If your interest in SQL and databases derives from pure educational purposes then you could also consider the free online course materials (presentations, video lectures) posted within initiatives like the ones of YouTube EDU or MIT Open CourseWare, many other lecture notes being available online from various well-known universities. Other important source of knowledge that gains field in the past years is the one of social and professional networks centered around the important technologies, so joining such a network could bring some benefit but you don’t expect miracles!

In what concerns the books maybe it makes sense to list at least my preferences given the huge volume of books on this topic. I’m a fan of “for Dummies” series, therefore I find Allen Taylor’s SQL for Dummies and its more extensive version SQL All-In-One Desk Reference for Dummies or Database Development for Dummies as a good starting point. If you are targeting SQL Server then you might need to check also Microsoft SQL Server 2008 For Dummies, Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies, Microsoft SQL Server 2005 Programming For Dummies or Microsoft SQL Server 2005 Express Edition For Dummies. In exchange the list of choices for Oracle or MySQL is quite smaller, as far I know there are only two books on the market on Oracle - Oracle PL/SQL For Dummies and Oracle 11g For Dummies, while for MySQL the books from “for Dummies” cycle target the use of MySQL in combination with other scripting programming languages: Apache, MySQL, and PHP Web Development All-in-One Desk Reference, PHP & MySQL For Dummies or PHP & MySQL Everyday Apps For Dummies. Another popular series are the “for Beginners” or “Beginning” series, having a more narrow focus, for example Beginning T-SQL with Microsoft SQL Server 2005 and 2008 or Beginning PL/SQL: From Novice to Professional could be two of the books you could start with.

Choosing the right book is highly dependent on what you want to achieve, your learning style and author preferences, books’ availability and costs, in general each professional having his/her own list of preferences. Years back I was quite stressed because many of the examples written in books were not working, nowadays hopefully that changed positively, in addition many publishers making the code available through books’ supporting sites or publishers websites together with a few sample chapters from the respective books. If you don’t want to buy the whole book or you’d like to read it online, you could become member of online libraries like Safari Books Online, Skillsoft’s Books 24x7, eBrary or SpringerLink. Many books are starting to become available for purchasing in PDF directly from publishers or in proprietary format on digital book readers like Amazon’s Kindle or Sony’s Reader, the digital books being in theory with 10-30% cheaper than the paperback products.

The Intermediate Level

The intermediate level is a relative point in the evolution of a professional, in theory this level being reached after several projects or prolonged use of SQL in various scenarios. During this stage many developers are starting to approach topics like performance or best practices, going thus above the basics, or interest themselves in taking a certification, attempting thus to approach the whole range of topics related to databases. Some of the books listed above qualify as well to the intermediate or expert levels, it depends also on books’ level of detail and approach on the given topics. The books’ series closer to this level are the ones for Professionals, for example Pro SQL Server 2008 Relational Database Design and Implementation, Pro T-SQL 2008 Programmer’s Guide, Professional Oracle Programming, Pro MySQL or Professional MySQL.

For those targeting the MCTS certification, it makes sense to consider the Self-Paced Training Kit Exams: (
Exam 70-432) Microsoft SQL Server 2008-Implementation and Maintenance, (Exam 70-433) Microsoft SQL Server 2008-Database Development, while for those approaching the MCITP certification then MCITP SQL Server 2005 Database Developer All-in-One Exam Guide is quite useful. As I haven’t approached the Oracle or MySQL certifications, I think it’s more indicated to do some research by yourself. 

The Expert Level

At this level developers are starting to go deeper into the database internals, learning how things are handled in the background, differentiate between multiple solutions and choose the optimum solution. Books like Microsoft SQL Server 2008 Internals, SQL Server 2008 Query Performance Tuning Distilled or Professional SQL Server 2008 Internals and Troubleshooting shouldn’t miss from a developer’s library! I would expect there are similar books also for Oracle and MySQL, but as I’m not aware of them so I will stop here.

At this level a developer could also approach the database topics from a higher level, making an incursion in the mathematical/conceptual theory of relational databases, books like Information Modeling and Relational Databases or Applied Mathematics for Database Professionals could become really handy in understanding such topics.

Note:

Most probably there are even better books out there though, as I highlighted above, these are my recommendations based on the books that have fallen in my hand along the years. In many cases, instead of buying a book, I’m looking on whether I could find on the web the information I need, though the quality of the material varies, sometimes same as the quality of the books. There are even cases when the information are contradictory, however in most of the cases I would go with the experts that have the information from the source. 

16 March 2009

DBMS: Query Plan (Definitions)

"The ordered set of steps required to carry out a query, complete with the access methods chosen for each table." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A portion of a DBMS that determines the most efficient sequence of relational algebra operations to use to satisfy a query." (Jan L Harrington, "Relational Database Design and Implementation" 3rd Ed., 2009)

"The plan produced by an optimizer for processing a query." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"A query plan is a sequence of logical and physical operators and data flows that the SQL query optimizer returns for use by the query processor to retrieve or modify data." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"Once the query optimizer determines the best way to execute a query, it creates a query plan. This identifies all the elements of the query, including what indexes are used, what types of joins are employed, and more." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A sequence of logical and physical operators and data flows that the SQL query optimizer returns for use by the query processor to retrieve or modify data." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide" 4th Ed., 2015)

[adaptive query plan:] "An execution plan that changes after optimization because run-time conditions indicate that optimizer estimates are inaccurate. An adaptive query plan has different built-in plan options. During the first execution, before a specific subplan becomes active, the optimizer makes a final decision about which option to use. The optimizer bases its choice on observations made during the execution up to this point. Thus, an adaptive query plan enables the final plan for a statement to differ from the default plan." (Oracle)

[default plan:] "For an adaptive plan, the execution plan initially chosen by the optimizer using the statistics from the data dictionary. The default plan can differ from the final plan." (Oracle)

[execution plan:] "The combination of steps used by the database to execute a SQL statement. Each step either retrieves rows of data physically from the database or prepares them for the session issuing the statement." (Oracle)

[query execution plan:] "The set of decisions made by the optimizer about how to perform a query most efficiently, including which index or indexes to use, and the order in which to join tables." (MySQL)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.