Blog
SQL : Tips for getting .Net and MySql to work together
23 Mar 2010 | 0 Comments | Permalink
I recently had the opportunity to build a new Chan-based site against a MySQL database, this article is essentially a collection of all the knowledge I needed to make an ASP .Net/Mono app talk to it, and so isnt exclusive to Chan (Because Chan does its best to get out of your way there's no database handling or binding built in) so hopefully it'll be useful to anyone else wanting to do it, just like the Chan + Mono + openSUSE install guide is also applicable to any ASP.Net app.
Background
Personally I tend to build very basic objects with typed properties to represent rows and List<T>'s of these to represent sets, I then use a basic central static object to open and return a database connection wherever I need it, and to close them when I'm done (Handy Tip 1: Do not Dispose() connections - it kills pooling in .Net). Anyway, what I end up with is a set of generic classes named DataTransformers with simple functions on top for retreiving and saving these objects. One call in such a DataTransformer would look like this:
class Users : DataTransformer<Types.User>
{
public static Types.User ByID(Int64? UserID)
{
return Select(DB.Connection, "SELECT TOP 1 * FROM [Users] WHERE [UserID]=?", new object[] { UserID });
}
}
The Select function (and the List<T> equivalent SelectList) accepts a Connection, SQL statement and array of objects and the base class takes care of building command objects and binding parameters to them. The connection is courtesy of my static connection builder which just spawns, and returns a new instance. When its done it creates an instance of the User class, populates it from the returned data and returns it. This way the code in the rest of the app can deal exclusively with the objects and never need to deal with the database.
For reasons best left unexplained the site in question was prototyped against an Access database and then deployed to MySql when development took proper flight. The migration was simple enough courtesy of the donation-ware app Access To MySql. But the SQL Syntax had to change to accomodate.
Installing the MySql data namespace
First thing to do was download the DLLs needed to talk to MySql from .Net, there's a handy DLL-only package that is also Mono compatible (as is Chan, remember!) at http://dev.mysql.com/downloads/connector/net/, just select ".Net & Mono" from the drop down and unzip the payload into your /bin directory. You then need to reference MySql.Data.dll in your solution (or at the csc command line with /R). You then have a native .Net MySqlClient data namespace.
Connecting to the database
MySql connection strings are, it turns out, the most sensible in the universe. So connecting is so easy your grandma could do it. This connects to the MySql database "bar" on localhost as user "foo":
using MySql.Data.MySqlClient; // ... MySqlConnection conn = new MySqlConnection( "Server=127.0.0.1;Uid=foo;Pwd=12345;Database=bar;" ); conn.Open(); // Do Stuff conn.Close();
Best practise is to wrap this and do it in its entirety for every single database operation, regardless of the db type you are connecting to. Repeatedly spawning, opening and closing objects might normally be seen as totally bonkers, but the connection pool actually handles things behind the scenes and you'll find that the connection isn't really being created, opened, closed and destroyed multiple times. It's quite smart and handles itself very ably, so let it do its job!
MySql Quirks
Quirks is probably a little unfair, as it's essentially a matter of point of view. The simplest changes were to replace the square brackets (supported by Access, MSSQL and SQLite) with backticks:
SELECT TOP 1 * FROM `Users` WHERE `UserID`=?
Next is to swap the TOP [count] clause with the much more useful LIMIT [start,stop], which is also the way that SQLite does things:
SELECT * FROM `Users` WHERE `UserID`=? LIMIT 0,1
Originally my code simply iterated through an array of objects and stuffed them into the connection as parameters in the order arrived, but MySQL throws a fit if the parameters dont have names. My auto-parameter-stuffing code had to change it from something like:
IDbCommand cmd = new MySqlCommand(sql, DB.Connection);
if( null != objects && objects.Length > 0 )
foreach( object o in objects )
{
IDataParameter prm = new MySqlParameter();
cmd.Parameters.Add(prm);
prm.Value = val ?? DBNull.Value;
}
To auto-name each parameter p0, p1, p2 etc as follows:
IDbCommand cmd = new MySqlCommand(sql, DB.Connection);
if( null != objects )
for( Int32 i = 0 ; i < objects.Length ; i++ )
{
IDataParameter prm = new MySqlParameter();
prm.ParameterName = "p" + i.ToString();
cmd.Parameters.Add(prm);
prm.Value = val ?? DBNull.Value;
}
The final tweak was to name the parameters from anonymous question marks to @-named params (like MSSQL) which matched the above scheme:
SELECT * FROM `Users` WHERE `UserID`=@p0 LIMIT 0,1
The final code snippet is below. If you come from an MS world - which is obviously natural for .Net developers - then pretty much everything you need to know from a coding point of view is somewhere above, there's really nothing else to it as the rest is taken care of by the MySql.Data DLL.
class Users : DataTransformer<Types.User>
{
public static Types.User ByID(Int64? UserID)
{
return Select(DB.Connection, "SELECT * FROM `Users` WHERE `UserID`=@p0 LIMIT 0,1", new object[] { UserID });
}
}
SQL : Using Joins To Insert, Update and Delete Data
04 Dec 2009 | 0 Comments | Permalink
For the uninitiated using JOINs in SQL to perform anything other than SELECTs probably seems like some kind of scary voodoo but the reasoning behind the ability is the same: It allows you to filter the data you are working with. This article walks through a made up scenario (similar to a real life one i've been tasked with, no less!) and shows how all three modification types are done with simple JOINS. This is a fairly long article so make sure you have a fresh coffee to hand before you dive in!
The Scenario
Let's say your employer is a widget manufacturer and they source the screws, nuts, bolts and washers from various suppliers. In fact they might get identical parts from different suppliers depending on availability. However their purchase history database was never set up to allow this and the only way of discriminating where a part came from is by looking at it's part number within the order. Let's start with three tables, a Parts table an Orders table and an OrderItems link table:
PartID PartName ------ -------- 1 Screw 2 Nut 3 Bolt OrderID OrderDate ------- ---------- 1 2009-12-01 2 2009-12-02 3 2009-12-03 ItemID OrderID PartID PartNo ------ ------- ------ ------- 1 1 1 BOB1234 2 1 2 BOB5678 3 2 1 898-122 4 2 2 820-111 5 2 3 133-022 6 3 2 22/0010 7 3 3 12/0666
This setup worked fine for a while, but performing any kind of analysis was a night on impossible task, with hardcoded LIKE statments all over the place. once the Items table had reached a million rows and another couple of suppliers were coming on board - meaning recoding all the views and queries, it was apparent something had to change. We had to create a Suppliers table, give each a unique CHAR(5) ID and then split the Parts table up so that each supplier had their own entry for each part they supplied.
SupplierID SupplierName Pattern BasePartID ---------- ---------------- ------- ---------- BOBPT Bob's Parts BOB% 0 JIMMY Jimmy's Emporium ___-___ 1000 DIRCT Direct Fixings __/____ 2000
At this point was decided to denormalise the Parts table - rather than have a seperate link table between parts and suppliers - and have a different row for each part and supplier combo, each with a different PartID, (sigh). Anyway, the Supplier Pattern column allows us to match part numbers in the Items table to each supplier using a LIKE and the BasePartID is used to split the PartIDs up. Both of these are only used during the spilt and in real life were manually stuffed into a secondary temp table.
The Procedure
First we need to get an authoratitive list of PartIDs along with the values to convert them to, we're going to precomupte these and store them in a temp table, rather than include this particular JOIN in all the later queries.
CREATE TABLE #Map
(
OldPartID INTEGER,
NewPartID INTEGER,
SupplierID CHAR(5),
PartNo VARCHAR(8)
)
INSERT INTO #Map( OldPartID, NewPartID, SupplierID, PartNo )
SELECT DISTINCT
ITM.PartID, ITM.PartID + SUP.BasePartID, SUP.SupplierID, ITM.PartNo
FROM OrderItems AS ITM
INNER JOIN Suppliers AS SUP
ON ITM.PartNo LIKE SUP.Pattern
Notice the tasty JOIN...LIKE, this results in a table that matches each Part to each supplier, along with a new PartID. SELECT * FROM #Map yields:
OldPartID NewPartID SupplierID PartNo --------- --------- ---------- -------- 1 1 BOBPT BOB1234 2 2 BOBPT BOB5678 1 1001 JIMMY 898-122 2 1002 JIMMY 820-111 3 1003 JIMMY 133-022 2 2002 DIRCT 22/0010 3 2003 DIRCT 12/0666
UPDATE with JOIN
Now we have a mapping we can go ahead and re-set all the PartIDs in the OrderItems table, we JOIN it to our #Map table and pick out the NewPartIDs to update each to.
UPDATE ITM
SET PartID = MAP.NewPartID
FROM
OrderItems AS ITM
INNER JOIN #Map AS MAP
ON ITM.PartNo = MAP.PartNo
There, that was easy wasnt it. The key part of the UPDATE...JOIN - and this is the same for the DELETE...JOIN - is that the table being modified is referenced not by name but by the alias as specified in the join, here it's my 3-char alias ITM. You need to do it this way because you can occasionally find yourself updating a table using a join back to itself!
As I stated earlier, you could skip the temp table and do this directly, but as we use the same mapping for each of the next set of operations the temp table really does make life easier. Though this operation could easily have been done without it:
UPDATE ITM
SET PartID = PartID + SUP.BasePartID
FROM
OrderItems AS ITM
INNER JOIN Suppliers AS SUP
ON ITM.PartNo LIKE SUP.Pattern
My only caveat here is that i get nervous when doing UPDATEs with direct arithmetic in, I like to write scripts (Transaction or no) where pressing the button to execute more than once will not royally bodge things up, a direct UPDATE to a particular value cures that and also helps when you're doing large operations in batches. This is the new OrderItems table:
ItemID OrderID PartID PartNo ------ ------- ------ ------- 1 1 1 BOB1234 2 1 2 BOB5678 3 2 1001 898-122 4 2 1002 820-111 5 2 1003 133-022 6 3 2002 22/0010 7 3 2003 12/0666
DELETE with JOIN
So now we can map each ordered part to each supplier, first we retool the Parts table to change all the id's around, this is an INSERT...JOIN. We'd need to disable the PartID autoincrement during this, our solution was to SELECT INTO a new table, INSERT into that and then use sp_rename to replace the old table with the modified one. Here's the INSERT:
ALTER TABLE Parts
ADD SupplierID CHAR(5)
INSERT INTO Parts( PartID , SupplierID , PartName )
SELECT MAP.NewPartID , MAP.SupplierID , PRT.PartName
FROM Parts AS PRT
INNER JOIN #Map AS MAP
ON PRT.PartID = MAP.OldPartID
WHERE MAP.NewPartID > 1000
INSERT has no native JOIN syntax but the above is close enough, it creates new entries for all the parts each supplier stocks - or at least, has supplied to us in the history of the database. The > 1000 check is because we are not remapping BOB's parts, in real life there were so many BOB rows (around 75% of the total) that the procedure killed the server and it was decided to leave BOB's IDs intact. What this means, however is that there are still entries in the parts table with no SupplierID and these need updating to belong to BOB. That's a simple UPDATE...WHERE SupplierID=NULL, first though we delete all the parts that BOB does not supply:
DELETE FROM PRT
FROM Parts AS PRT
LEFT OUTER JOIN #Map AS MAP
ON PRT.PartID = MAP.OldPartID
AND MAP.SupplierID='BOBPT'
WHERE MAP.NewPartID IS NULL
AND PRT.PartID < 1000
UPDATE Parts SET SupplierID='BOBPT' WHERE SupplierID IS NULL
You could probably achieve the same with a NOT IN() but the outer join is much more fun. Like UPDATE...JOIN you reference the alias in the DELETE FROM section, then there's another FROM to signal the start of the JOIN. The LEFT OUTER preserves entries in Parts (on the LEFT of the join) that are not in #Map (on the RIGHT of the join). The NULL check is the final filter of what to delete. Finally we need the PRT.PartID < 1000 check so that we dont DELETE all our newly created parts for JIMMY and DIRCT. To break this down we can do a simple SELECT with the same JOIN:
SELECT * FROM Parts AS PRT
LEFT OUTER JOIN #Map AS MAP
ON PRT.PartID = MAP.OldPartID
AND MAP.SupplierID='BOBPT'
WHERE PRT.PartID < 1000
Returns:
PartID SupplierID PartName OldPartID NewPartID SupplierID PartNo ------ ---------- -------- --------- --------- ---------- ------- 1 NULL Screw 1 1 BOBPT BOB1234 2 NULL Nut 1 2 BOBPT BOB5678 3 NULL Bolt NULL NULL NULL NULL
So we can see that filtering these WHERE NewPartID IS NULL will simply return the entry for Bolt. Here is the final Parts table:
PartID SupplierID PartName ------ ---------- -------- 1 BOBPT Screw 2 BOBPT Nut 1001 JIMMY Screw 1002 JIMMY Nut 1003 JIMMY Bolt 2002 DIRCT Nut 2003 DIRCT Bolt
At this point you'd add a couple of CHECK constraints to make sure that each Orders and Parts row contains only valid SupplierIDs and PartIDs. You could also - making the likely assumption that an order only ever involves one supplier at a time - UPDATE the Orders table to contain a SupplierID but we'll leave that as an excersise for the reader :-)
SQL : Using a Conditional Where Clause with an Outer Join
08 Apr 2008 | 0 Comments | Permalink
In a previous article I showed how to filter a result-set in a stored procedure via a switch in the Where clause, this takes that a step farther, you should read that article first before trying to get to grips with this one.
Previously we imagined a schema with two tables, tblWidgets and tblWidgetTests, but suppose you were faced with the same problem but your schema instead only had the following tables: tblWidgets and tblWidgetFailures. This has happened to me in real life so I can't be the only one - there's no assertive flag that states if a Widget is good or not, it's assumed to be good until logged as a failure.
Now, returning to the previous problem, with writing a stored procedure that needs to potentially return three result sets:
- Show all Test Passes
- Show all Test Failures
- Show Both
First let's initiate our stored procedure:
CREATE PROCEDURE dbo.splTestResults
@Pass int = NULL
Now, the return-only-failures case is easy - it's a join to tblWidgetFailures but we'd need an Outer Join to get the (assumed) passes, this is how we do that part
SELECT WID.WidgetID,
WID.WidgetSerialNo
FROM tblWidgets WID
LEFT OUTER JOIN tblWidgetFailures WIF
ON WID.WidgetID = WIF.WidgetID
Here those that have failed will have an entry to the right of the outer join and the assumed-passes will have nulls there instead. This gives us something to work with, using a CASE:
SELECT
WID.WidgetID,
WID.WidgetSerialNo,
Pass = (CASE WHEN WIF.WidgetID IS NULL THEN 1 ELSE 0 END)
FROM tblWidgets WID
LEFT OUTER JOIN tblWidgetFailures WIF
ON WID.WidgetID = WIF.WidgetID
The CASE logic is relatively simple, returning 1 or 0 depending on the absence or presnce of data, respectively. This sort of fun with Outer Joins and its associated logic alone can terrify your co-workers and be used to shutdown evil robots: we're using the presence of a NULL to convey useful information.
Now for that good old conditional where, because Pass is now a derived column we can't use the same logic as the original article, instead we need to mirror the logic used in the CASE, and here it is:
WHERE (@Pass IS NULL
OR (@Pass=1 AND WIF.WidgetID IS NULL)
OR (@Pass=0 AND WIF.WidgetID IS NOT NULL)
)
Again, same logic as the CASE, but a bit more specific:
- If @Pass is 1 then we return those NOT present to the right of the outer join
- If @Pass is 0 then we return those that are
- If @Pass is NULL then return everything
As in the original article the NULL case causes the WHERE to ignore the result set and return everything it finds. The bracket nesting is needed to explicitly lay out each use case but there's still nothing stopping you from adding in extra WHERE clauses after closing the outer brackets (e.g. a Date BETWEEN clause).
Naturally you'd want to add some comments in and around the code for your stored procedure to explain exactly what's happening to the next hacker who's assigned the job, or yourself in two years time.