ASP.NET MVC: Keep Private Settings Out of Source Control

Posted on April 6 2014 06:24 AM by jatten in C#, CodeProject, ASP.Net, ASP.NET MVC   ||   Comments (0)

It is just too easy to accidentally push confidential information up to a publicly hosted source repository such as GitHub. Also, when a project has multiple developers, managing multiple configuration files between team members can become messy.

How often do you pull the latest changes down from source control, and then need to reset a database connection string after someone else accidentally pushed their own modified App.config or Web.config file up?

Even when the settings or connection strings are not critically private, this can be a pain.

Image by Rina Pitucci  |  Some Rights Reserved

Consider a typical Web.config file from an ASP.NET MVC web application (non-relevant content removed for clarity):

ASP.NET Web.config File Example:
<?xml version="1.0" encoding="utf-8"?>
<!--
  A bunch of ASP.NET MVC web config stuff goes here . . . 
  -->
<configuration>
  <connectionStrings>
    <add name="DefaultConnection" value="YourConnectionStringAndPassword"/>
  </connectionStrings>
  <appSettings file="PrivateSettings.config">
    <add key="owin:AppStartup" value="AspNetIdentity2ExtendingApplicationUser.Startup,AspNetIdentity2ExtendingApplicationUser" />
    <add key="webpages:Version" value="3.0.0.0" />
    <add key="webpages:Enabled" value="false" />
    <add key="ClientValidationEnabled" value="true" />
    <add key="UnobtrusiveJavaScriptEnabled" value="true" />
    <add key="EMAIL_PASSWORD" value="YourEmailPassword"/>
  </appSettings>
</configuration>

 

In the above, there is a database connection string we likely don't want to push to a public repo, and which may differ from developer to developer on a team when each is working against a different or individual development version of the application database.

Also, there is an email password, likely used to send email from within the application, which also may differ amongst team members during development, and which also should not be published publicly.

At the same time, there is a bunch of other stuff which is global to the application, so keeping the entire Web.config file out of source control is not an attractive option, either.

Fortunately, the .NET ConfigurationManager affords us a couple of handy ways to deal with this.

Use the configSource Attribute to Move an Entire Configuration Section to Its Own File

We can use the configSource attribute to move an entire Configuration Section to an external file. For example, database connection strings are one of the most common items we need to keep in our App.config or Web.config files, but which we also (usually) don't want to publish to a publicly hosted source control repository.

We can add a separate configuration file named (for example) ConnectionStrings.config, and then use the configSource attribute within our Web.config file to refer to it. To do so, add a new Web Configuration file, name it ConnectionStrings.config, and then put only the following in the new file (no XML header - nothing but the <connectionStrings> section tags and the <add> element(s)):

ConnectionStrings.config File Example:
<connectionStrings>
  <add name="DefaultConnection" value="YourConnectionStringAndPassword"/>
</connectionStrings>

 

Then, we can modify our original Web.config file, removing the <add> element from the <connectionStrings> section, and instead, using the configSource attribute to refer to the new ConnectionStrings.config file:

Modified Web.config File Using configSource:
<connectionStrings configSource="ConnectionStrings.config">
</connectionStrings>

 

Now, we can still access our connection string the same as always:

Accessing Connection String By Name:
var conn = ConfigurationManager.ConnectionStrings["DefaultConnection"];
string connString = conn.ConnectionString;
// Etc...

 

In the above, accessing the connection string by name like that returns a ConnectionStringSettings object.
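Note that if the external ConnectionStrings.config file is missing - say, on a fresh clone before a developer has created their local copy - the section may fail to load with an exception, or the named entry may simply not exist. A small defensive guard (illustrative only; the error message wording is ours) can produce a friendlier failure:

Defensive Connection String Lookup (Illustrative):
var conn = ConfigurationManager.ConnectionStrings["DefaultConnection"];
if (conn == null)
{
    // Most likely cause: the local ConnectionStrings.config has not been created yet.
    throw new ConfigurationErrorsException(
        "DefaultConnection was not found - did you add your local ConnectionStrings.config?");
}
string connString = conn.ConnectionString;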

When we use the configSource attribute, the Configuration Section to which it is applied can contain no actual elements. The entire section will be referred to from the external file. Note that the configSource attribute can be used in this manner with any Configuration Section.
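For example (purely illustrative - the file name AppSettings.config here is our own invention), the same mechanism could move the entire <appSettings> section out to its own file. Contrast this with the file attribute, which merges two locations:

configSource Applied to AppSettings (Illustrative):
<!-- The element itself must remain empty; AppSettings.config must contain
     the complete <appSettings> section, including the section tags. -->
<appSettings configSource="AppSettings.config" />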

Use the File Attribute to Move Select Application Settings to an External File

You may have a case, such as our example Web.config file above, in which most of the values in the <appSettings> Configuration Section are global to the project, but a handful of settings should remain private and be kept out of source control.

In these cases, there is a special file attribute available specifically to the <appSettings> section which essentially allows us to extend <appSettings> into an external file. In other words, ConfigurationManager will recognize the contents of both locations when referring to <appSettings>, and make all of the settings transparently available within the application.

In our example case, we have an email password we would like to keep private. We might add another Web Configuration file named PrivateSettings.config. Once again, there should be no XML header. The only things this file should contain are a set of <appSettings> tags and, within those, the special settings we wish to define privately.

Special PrivateSettings.config File Extends AppSettings Section:
<appSettings>
  <add key="MAIL_PASSWORD" value="xspbqmurkjadteck"/>
</appSettings>

 

Now, we remove the email password element from Web.config, and add the file attribute to the <appSettings> section element, pointing to the new PrivateSettings.config file:

Add File Attribute to Web.config AppSettings:
<appSettings file="PrivateSettings.config">
  <add key="owin:AppStartup" value="AspNetIdentity2ExtendingApplicationUser.Startup,AspNetIdentity2ExtendingApplicationUser" />
  <add key="webpages:Version" value="3.0.0.0" />
  <add key="webpages:Enabled" value="false" />
  <add key="ClientValidationEnabled" value="true" />
  <add key="UnobtrusiveJavaScriptEnabled" value="true" />
</appSettings>

 

Again, as before, we can access any of our settings in the standard manner - externalizing the email password setting to a separate file is transparent to client code:

Accessing Settings:
var pwd = ConfigurationManager.AppSettings["MAIL_PASSWORD"];

 

Add Special Files to .gitignore

Now we can add our Web.config file to source control, add the two special files, ConnectionStrings.config and PrivateSettings.config, to our .gitignore file, and commit away. When it's time to push to a shared repo, our private information will stay private.
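The .gitignore entries themselves are just the two file names:

.gitignore Entries for the Private Files:
# Keep local/private configuration files out of the repository:
ConnectionStrings.config
PrivateSettings.config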

Documentation is Key

Of course, when we take this type of approach, it will be helpful to other developers if our documentation clearly indicates what is going on here. We might do this in our project README file, and/or add some XML comments at each point in our modified Web.config informing others that they will need to add the proper files to their local version of the project, and what those files should contain.
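For example, an XML comment in Web.config might look like this (the wording is ours - adapt as needed):

Example Web.config Comment for Team Members:
<!-- NOTE TO DEVELOPERS: This section is loaded from ConnectionStrings.config,
     which is NOT in source control. Create your own local copy containing a
     <connectionStrings> section - see the README for the expected contents. -->
<connectionStrings configSource="ConnectionStrings.config">
</connectionStrings>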


Biggy: Evolving the Architecture

Posted on April 2 2014 07:29 AM by jatten in Biggy, C#, Database   ||   Comments (0)

Recently, K. Scott Allen proposed a re-thinking of the fundamental architecture for Rob Conery's Biggy project. Over the past month, the project has grown from a simple, flat-file, in-memory JSON store into a more ambitious, LINQ-compliant high-performance document/relational query tool. The existing structure worked, but there are clearly places where it is beginning to become brittle.

Biggy is a high-performance, synchronized in-memory document/relational query tool for .NET. The project attempts to combine the most desirable features of document and relational data stores, along with some ORM-like features.

Image by Ian Kershaw | Some Rights Reserved

As of this writing, the existing implementation is heavily inheritance-driven. K. Scott Allen's solution proposes an interface-centric approach, one which more cleanly separates the concerns of data access against the backing store from those of managing the list-based, in-memory representation of the data as domain models. Of the existing architecture, Mr. Allen states, and I agree, that:

Looking over the Biggy implementation, every different data store becomes coupled to an InMemoryList<T> class through inheritance. The coupling isn’t necessarily wrong, but it does complicate the implementation of each new data store.

Having worked rather extensively with the code base on the relational database implementation, I found exactly this to be the case. Not that the current architecture was bad, but simply that the project had grown outward from Rob's early implementations, to the point where what was once simple was becoming increasingly complex.

Hiding Implementation with Interface Abstraction

Under the proposed architecture, Biggy would utilize a few interfaces to take some of the pain out of extending the implementation to support various data stores.

In his post, Scott Allen proposes, first and foremost, separating Lists from stores through interface implementation. The strength of Biggy as a library results from delivering an in-memory list representation of application data as domain objects.

K. Scott’s approach cleanly separates the responsibilities of the Store (fetch/push data to the back-end database) from the in-memory list (query/manipulate data within the domain model), and proposes to abstract the store functionality behind a set of interfaces, IBiggyStore<T>, such that different backing store implementations can be easily ported into the library. Further, the in-memory list implementation will be abstracted behind its own interface, IBiggy<T>, into which an instance of IBiggyStore<T> is injected, thereby completing the de-coupling of store from list.

As I undertook to implement the proposed structure, I made a few minor adjustments, finally arriving at the following interface for the in-memory IBiggy<T> abstraction:

The IBiggy<T> Interface and Associated Sub-Classes:
public interface IBiggy<T> : IEnumerable<T>
{
    void Clear();
    int Count();
    T Update(T item);
    T Remove(T item);
    List<T> Remove(List<T> items);
    T Add(T item);
    List<T> Add(List<T> items);
    IQueryable<T> AsQueryable();
  
    event EventHandler<BiggyEventArgs<T>> ItemRemoved;
    event EventHandler<BiggyEventArgs<T>> ItemAdded;
    event EventHandler<BiggyEventArgs<T>> ItemsAdded;
  
    event EventHandler<BiggyEventArgs<T>> Changed;
    event EventHandler<BiggyEventArgs<T>> Loaded;
    event EventHandler<BiggyEventArgs<T>> Saved;
}

 

The primary differences between my implementation above and K. Scott’s proposed structure are the addition of a Remove(List<T>) method to remove a range of items from the list, and changing everything from IEnumerable<T> to List<T>. The reason for the latter was that we had variations between List<T>, IEnumerable<T>, IList<T>, and others scattered about the API. While I can see a case for moving back to IEnumerable, I was having to do a whole lot of myEnumerable.ToList() and such. We’ll see what happens. For now, I made everything I could List<T>.

 

Abstracting the Backing Store

One of the principal drivers behind the architectural changes was to separate the responsibilities of in-memory list management from those of data transfer to and from the backing store. Per K. Scott’s original proposal, the core IBiggyStore<T> interface is a simple, brute-force affair. Again, I have made some minor modifications which may be reversed before this reaches production, but for now, the basic interface looks like this, with the additional IUpdateable and IQueryable variants as well:

The Biggy Store Interfaces:
public interface IBiggyStore<T>
{
    List<T> Load();
    void SaveAll(List<T> items);
    void Clear();     
    T Add(T item);
    List<T> Add(List<T> items);
}
  
public interface IUpdateableBiggyStore<T> : IBiggyStore<T>
{
    T Update(T item);
    T Remove(T item);
    List<T> Remove(List<T> items);
}
  
public interface IQueryableBiggyStore<T> : IBiggyStore<T>
{
    IQueryable<T> AsQueryable();
}

 

In the above, we have abstracted the basic store functionality behind a set of interfaces. This allows us to swap backing stores with ease, while ensuring code which consumes instances of IBiggy will continue to function properly.
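As an illustration of how little a store now needs to do - this class is our own sketch, not part of the Biggy code base - a trivial in-memory implementation of IBiggyStore<T> might look like:

Illustrative In-Memory IBiggyStore Implementation:
// Illustrative only - a minimal store satisfying the IBiggyStore<T> contract.
public class InMemoryStore<T> : IBiggyStore<T>
{
    readonly List<T> _items = new List<T>();

    public List<T> Load() { return new List<T>(_items); }
    public void SaveAll(List<T> items) { _items.Clear(); _items.AddRange(items); }
    public void Clear() { _items.Clear(); }
    public T Add(T item) { _items.Add(item); return item; }
    public List<T> Add(List<T> items) { _items.AddRange(items); return items; }
}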

In my current implementation, I have defined a base class BiggyRelationalStore which contains code which will be common to any relational database implementation. Then, there are abstract methods which require concrete implementation in platform-specific subclasses.

As we can see, a concrete implementation of IBiggy through BiggyList can call out to the backing store through the various store interface methods. As K. Scott says in his post:

"…the implementation of an actual data store doesn’t need to call into a base class or worry about raising events. The store only does what it is told…"

A Single Concrete List Implementation

The injection of the data store as an instance of IBiggyStore<T> allows us to create a single implementation class for the basic BiggyList. BiggyList is, of course, the business end of the Biggy library – its raison d’etre if you will. By injecting an abstract IBiggyStore instance into the constructor of the BiggyList class, we are able to get all of the code which previously managed database platform-specific store interaction out of the BiggyList class and safely stick it behind the IBiggyStore interface.
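In sketch form (simplified and ours - the real BiggyList also raises events, among other things), the injection looks something like this:

Simplified BiggyList Constructor Injection Sketch:
public class BiggyList<T> : IBiggy<T>
{
    readonly IBiggyStore<T> _store;
    List<T> _items;

    public BiggyList(IBiggyStore<T> store)
    {
        _store = store;
        _items = _store.Load();  // hydrate the in-memory list from the store
    }

    public T Add(T item)
    {
        _store.Add(item);        // persistence is delegated to the injected store
        _items.Add(item);
        return item;
    }

    // ... remaining IBiggy<T> members elided ...
}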

Sometimes, Abstraction Comes with a Price

On the whole, K. Scott’s new architecture cleaned up the Biggy code base significantly. However, this does not come without a price. Store abstraction and injection now requires the following to initialize a new BiggyList<T> instance:

Initializing a new BiggyList Instance with Store Injection:
// Initialize a Store Instance:
IBiggyStore<Artist> _artistStore = new SQLServerStore<Artist>("chinook");
  
// Inject the store into the BiggyList Constructor:
IBiggy<Artist> _artists = new BiggyList<Artist>(_artistStore);

 

We have added a bit of ceremony on the front side, in that we have to “new up” a store instance (in this case, a SQL Server store) for injection into our list constructor. Of course, we could do this in-line:

Initializing a new BiggyList Instance with Inline Store Injection:
// Inject a new store into the BiggyList Constructor:
IBiggy<Artist> _artists = new BiggyList<Artist>(new SQLServerStore<Artist>("chinook"));

 

But there is still some additional cognitive overhead (and a lot of repetitive type arguments to deal with!).

Also, because each concrete BiggyStore<T> is tied to a specific type argument <T> (and by extension, a specific database table, depending upon the backing store), we need to initialize a new store for each type-specific list we wish to consume:

Initializing Multiple BiggyList Instances:
// Initialize several Store Instances:
IBiggyStore<Artist> _artistStore = new SQLServerStore<Artist>("chinook");
IBiggyStore<Album> _albumStore = new SQLServerStore<Album>("chinook");
IBiggyStore<Track> _trackStore = new SQLServerStore<Track>("chinook");
// Inject a new store into each BiggyList Constructor:
IBiggy<Artist> _artists = new BiggyList<Artist>(_artistStore);
IBiggy<Album> _albums = new BiggyList<Album>(_albumStore);
IBiggy<Track> _tracks = new BiggyList<Track>(_trackStore);

 

Hmmmm …

Now here we are looking at some repetitive coding, particularly since we need to specify the type argument <T> for each no less than five times . . .

However, since the whole purpose of Biggy is to get your data into memory for fast performance, while keeping things in sync with the backing store, you should mostly be able to do this once within your application, and then you’re off and running.

We shall see how things evolve. As I am learning, there is a balance between a friendly, easy-to-use API and “proper architecture” that is not always clear-cut.

Cache Schema Information

Biggy has a relatively sophisticated system for matching domain objects and properties with database tables and columns. Also, specific to relational database stores, there is the issue of primary keys, and whether or not these are auto-incrementing (“Identity” columns in SQL Server, and “serial” column types under Postgres).

Since Biggy relies heavily upon mapping database object names to domain object names to do its job, it made sense to pull as much schema information from the actual database as possible, then map objects and properties accordingly. We can accomplish this by hitting INFORMATION_SCHEMA once for a list of tables in the database, and again for a list of all the columns in the database. We then map columns to tables in memory, and make these mappings available for comparison to object names and properties.
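The two INFORMATION_SCHEMA queries amount to something like the following (the exact column lists vary by what the cache needs):

Example INFORMATION_SCHEMA Queries:
-- One round trip for the tables, one for all columns in the database:
SELECT TABLE_NAME, TABLE_SCHEMA FROM INFORMATION_SCHEMA.TABLES;

SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS;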

This cache of schema information can be retrieved during initialization of an IBiggyStore<T> instance, or passed to a constructor override, depending on application requirements. If we needed just a single table’s data, we might just initialize our store using the database connection string name, as before:

IBiggyStore<Artist> _artistStore = new SQLServerStore<Artist>("chinook");
IBiggy<Artist> artists = new BiggyList<Artist>(_artistStore);

 

No harm no foul above. It still seems a little clunkier than previous versions of Biggy, but not too bad.

Behind the scenes, during store initialization, the cache is still being retrieved, just scoped to this single instance. In the constructor for the sub-class SQLServerStore<T> we find:

public SQLServerStore(DbCache dbCache) : base(dbCache) { }
public SQLServerStore(string connectionString) 
    : base(new SQLServerCache(connectionString)) { }

 

As we can see, the constructor override is simply initializing a new SQLServerCache instance and passing it to the constructor of the base class, BiggyRelationalStore.

If, on the other hand, we need to spin up several tables (as in our earlier example above) it will make more sense to grab our schema stuff at the start, and pass references in to our store objects:

var schema = new SQLServerCache("chinook");
  
// Initialize several Store Instances, but pass the cached schema info in:
IBiggyStore<Artist> _artistStore = new SQLServerStore<Artist>(schema);
IBiggyStore<Album> _albumStore = new SQLServerStore<Album>(schema);
IBiggyStore<Track> _trackStore = new SQLServerStore<Track>(schema); 
  
IBiggy<Artist> artists = new BiggyList<Artist>(_artistStore);
IBiggy<Album> _albums = new BiggyList<Album>(_albumStore);
IBiggy<Track> _tracks = new BiggyList<Track>(_trackStore);

 

Yup. That looks pretty clunky.

All the abstraction/injection has made the Biggy code base more robust (much more, in my mind), but has made it less friendly from an API perspective.

What to do?

Simpler, More Friendly API or Stronger, More Flexible Library Structure?

The architecture proposed by K. Scott Allen most definitely improved the code organization, created better separation of concerns between the BiggyList and the backing store, and in general has created a code base which is more extensible. However, it has also introduced a good deal more ceremony from an API usage standpoint.

Do we decide, from a project standpoint, to wrap it all up somehow such that the API is simplified, but less extensible? Or do we provide the library as-is, and allow consumers to decide how best to wrap it up in the context of their own projects?

I love the new structure, and while there is room for it to evolve (and I am CERTAIN I have missed some easy ways to make it more friendly!), I think the basics are there, and for the moment, the tradeoff is worth it. But that’s just my opinion, and the simple, no-ceremony API Biggy was born with is no longer so simple. And utter simplicity is one of Biggy’s strong points.

Wrap it Up in a Factory of Sorts . . .

Of course, depending on your application requirements, solutions to the simplicity conundrum might be easy to find. For example, if I were whipping up an application today, I might add a thingamajig like so:

Example Biggy Implementation Wrapper:
public class MyDatabase 
{
    DbCache _cache;
    public MyDatabase(string connectionStringName) 
    {
        _cache = new SQLServerCache(connectionStringName);
    }
  
    public IBiggyStore<T> CreateStoreFor<T>() where T : new()
    {
        return new SQLServerStore<T>(_cache);
    }
  
    public IBiggy<T> CreateBiggyList<T>() where T : new() 
    {
        return new BiggyList<T>(CreateStoreFor<T>());
    }
}

 

The above could then be called like so:

Consuming the Example Wrapper:
_db = new MyDatabase("chinook");
  
var artists = _db.CreateBiggyList<Artist>();
foreach (var artist in artists) 
{
    Console.WriteLine(artist.Name);
}

 

. . . Or Wrap it Up in a Context

Alternatively, one could borrow a page from Entity Framework, and go the “context” route:

Example Biggy Context Wrapper:
public class MyDatabaseContext : MyDatabase 
{
    public MyDatabaseContext(string connectionStringName) 
        : base(connectionStringName) 
    {
        this.Artists = this.CreateBiggyList<Artist>();
        this.Albums = this.CreateBiggyList<Album>();
        this.Tracks = this.CreateBiggyList<Track>();
    }
    public IBiggy<Artist> Artists { get; set; }
    public IBiggy<Album> Albums { get; set; }
    public IBiggy<Track> Tracks { get; set; }
}

 

 

The above makes possible the materialization of your data store into memory immediately upon initialization. Data can then be consumed directly, like so:

Consuming the Example Context Wrapper:
_db = new MyDatabaseContext("chinook");
foreach (var artist in _db.Artists) 
{
    Console.WriteLine(artist.Name);
}

 

The above are two similar, but slightly different ways one might consume Biggy within an application. While the new architecture adds a degree of (probably) undesirable ceremony to the simplest use case, it does enable some flexibility and extensibility which may have been more difficult previously.

It could be we wrap the new architecture into something like the above as part of the API for the library. I'm interested to see what others come up with.

Pushed Breaking Changes

At the suggestion of the project owner, I've just pushed my changes to the master repo. This is going to break things for anyone who has built out around the previous structure.

I've revised/adapted the tests and the perf demos to work with the new architecture, and as of now, I await feedback from the project owner as to what is acceptable, and what may need additional work.

More will be revealed . . .


Building Biggy: Resolving Dissonance Between Domain Objects and Backing Store Entities

Posted on March 31 2014 09:43 PM by jatten in C#, Biggy, Database   ||   Comments (1)

I've recently been hard at work contributing to an exciting new open source project, Rob Conery's Biggy. Biggy is just getting off the ground, so things are evolving rapidly, and getting things working, for the moment, trumps elegance. At this point, we are refining things and evolving the structure and capabilities of the project.

Biggy offers a fresh take on maintaining an in-memory, synchronized representation of your data, along with a hybrid document/relational persistence model.

The engine Biggy currently uses to talk to relational databases is a tuned-up, modified version of Massive. Originally a highly dynamic micro ORM, for Biggy we have added strong support for static types, and customized things so that Biggy can materialize relational data into domain POCO objects in a high-performance way.

Image by Dee Ashley | Some Rights Reserved

Biggy supports multiple database platforms. While currently the focus is on SQL Server and Postgresql, the notion is that ultimately, Biggy should be able to support any platform which implements the correct IBiggyStore interface (see K. Scott Allen's recent post proposing our upcoming new architecture in this regard).

One of the major problems we needed to solve was the mapping of database objects to domain objects. Yes, this is one of the ever-present issues with ORM design (note that Biggy is NOT an ORM, although it shares some handy features with ORMs).

Mapping Domain Object Properties Using Type Introspection

The original Massive, and Biggy as well, use a wee bit of type introspection to identify object property names. Massive would originally assume that each property would correspond to a like-named database field.

Of course, this is not a safe assumption, especially in cases where one is handed an existing database with names which don't follow convention. Also, certain Db platforms (Postgresql among them) are case-sensitive with respect to object names, and expect, by convention, that multi-part names will be separated using underscores.

Consider a domain object Artist:

An Example Artist Class:
public class Artist 
{
    public int ArtistId { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

 

SQL Server is fairly forgiving, and makes no distinction between a column named LastName and a column named lastName, or a column named lastname. SQL Server doesn't care about case. SQL Server will, however, require delimiters for a column named Last Name, so this would have to be wrapped up like so: [Last Name].

Postgres, however, is a different story. Postgres comes from a Unix heritage, where all the things are case-sensitive. Postgres expects database object names to be all lower-case unless you tell it otherwise by using double-quotes as delimiters. On top of that, when you pass raw SQL in which includes mixed-case object names, Postgres "helpfully" down-cases everything for you, unless specific object names are wrapped in delimiters. This is all fine and dandy, until you have actual database columns which include upper-case characters. 

In Postgres-land, the object naming convention is to lower-case names, and separate words with an underscore. We expect to see columns such as last_name and first_name, and we don't often see things like LastName or lastName. None of which is to say it doesn't happen, of course. In fact, the default Postgres version of our chosen test database, Chinook, arrives out-of-the-box with proper-cased table and column names. For example, the Artist table contains the columns ArtistId and Name.

NOTE: For those interested, I have created a modified version of the Chinook database with object names which conform to the Postgres convention, and which implements serial primary keys. The version available on Codeplex has neither (though it is a terrific sample database nonetheless!).

If you have an Artist table in Postgres with the columns ArtistId and Name, the following SQL will fail:

This SQL Statement will Fail Against Postgres:
SELECT ArtistId, Name FROM Artist

 

In order for the above query to work, it would need to be re-written as follows:

This SQL will Work Against the Artist Table, but YUCK!
SELECT "ArtistId", "Name" FROM "Artist"

 

Lastly, if one is using Postgres platform conventions, dutifully using all-lower-case table and column names, properly separated with underscores, then under the original implementation of Massive and Biggy, we would have to name our domain objects and object properties accordingly:

Example Artist Class With Postgres-Compatible Object and Property Names:
public class artist 
{
    public int artist_id { get; set; }
    public string last_name { get; set; }
    public string first_name { get; set; }
}

 

For the exceptionally simple artist class, this may not seem such a big deal. However, across a number of more complex classes, a bunch of property names which include underscores as word separators is simply painful. Not to mention, idiomatic C# code utilizes either proper-casing or camel-casing. It is nice to respect the conventions of whatever platform, language, or framework one is using, even when they mix and clash.

Solving Two Distinct Problems

As we see in the examples above, we are actually faced with two similar, yet overlapping problems:

  • Properly mapping domain objects and properties to the appropriate, corresponding database entity or field.
  • Properly handling cases where database object names need to be delimited in a platform-specific manner when serialized into dynamic SQL.

Since Biggy dynamically assembles SQL based on object properties, it was clear that we had a few choices to make. There were two potential approaches to this problem:

The "Opinionated" Approach

Decide which conventions Biggy will support, and encourage users to adopt these as standard defaults. Provide a mechanism to override the defaults, but throw unless client code either follows our default convention or explicitly applies the override mechanism. The "Rails" approach, if you will.

The "Kitchen Sink" approach

Assume that client code will sensibly use a few accepted conventions, and try to map those first. If this fails, then attempt to resolve the mapping through progressive permutation. If that fails, then look for an override mechanism. Fail only if we are unable to find a match using magic. More like the ASP.NET MVC approach, if you will.

I was surprised to find Rob was in favor of the second, "Kitchen Sink" approach. I fully expected him to favor a more opinionated policy. In any case, it was decided we would do everything we reasonably could to resolve column names to object properties.

It was decided Biggy would attempt to resolve name mappings along the following lines:

  1. See if there is a direct match between the column name and the property name. If so, we're done - map it.
  2. See if, by ignoring case and/or stripping spaces and underscores out of the Db Column name, a match could be determined. If so, use it.
  3. If neither of the above successfully produces a match, look for a custom attribute on the property in question which specifies the Db Column to map to.
  4. Throw an informative exception indicating the source of the problem.
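Steps 1 and 2 boil down to something like the following (an illustrative sketch of ours, not the actual Biggy implementation):

Illustrative Column/Property Matching Sketch:
// Sketch of resolution steps 1 and 2 - direct match first, then a
// flattened, case-insensitive comparison.
static bool IsMatch(string columnName, string propertyName)
{
    // 1. Direct match between column name and property name:
    if (columnName == propertyName) return true;

    // 2. Strip spaces and underscores from the column name, then compare ignoring case:
    string flattened = columnName.Replace(" ", "").Replace("_", "");
    return string.Equals(flattened, propertyName, StringComparison.OrdinalIgnoreCase);
}

// IsMatch("last_name", "LastName") -> true
// IsMatch("Last Name", "lastname") -> true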

It would have been easy to decide that seeking out a custom attribute would be the first thing to do, and under what I consider the "opinionated" approach, this would have been the next course of action failing an initial match. However, in one of my favorite observations this year, Rob pointed out that using custom attributes in such a manner is distasteful because "It throws the DB stuff right into your model and it's the ultimate push of the database right through the abstraction."

"I really dislike attributes because it throws the DB stuff right into your model and it's the ultimate push of the database right through the abstraction."

-Rob Conery

I couldn't agree more. And I wish *I* had said that.

Get the Schema from the Source - Information Schema

All that said, at some level, the code needs to know some things about the database in order to use it. Just not at the domain model level.

Since Biggy is purpose-built to maintain an in-memory representation of application data and sync to the store as needed (on writes and updates), I figured, why not go straight to the source for a canonical schema against which to compare property names to columns? To my way of thinking, standard relational databases already provide access to this canon of schema information - the ANSI Standard collection of Views known as INFORMATION_SCHEMA.

At some level, Biggy needs to understand and know about the database schema in order to do its job. We can query Information_Schema, and then cache the result at the lowest level possible. That being the case, we introduced some new class structures to the Biggy code base.

A Column Mapping Class

At the lowest level, I wanted to be able to represent everything Biggy would need to know about a database column and how it maps to its associated object property. Initially, I created the DbColumnMapping class, as follows:

The DbColumnMapping Class:
public class DbColumnMapping
{
    string _delimiterFormatString;
    public DbColumnMapping(string delimiterFormatString)
    {
        _delimiterFormatString = delimiterFormatString;
        this.IsAutoIncementing = false;
        this.IsPrimaryKey = false;
    }
  
    public bool IsAutoIncementing { get; set; }
    public bool IsPrimaryKey { get; set; }
    public Type DataType { get; set; }
    public string TableName { get; set; }
    public string ColumnName { get; set; }
    public string PropertyName { get; set; }
    public string DelimitedColumnName
    {
        get { return string.Format(_delimiterFormatString, this.ColumnName); }
    }
}

 

Notice that the database platform delimiter is passed in as a format string. For example, in the case of SQL Server, the format string looks like this:

SQL Server Delimiter Format String:
"[{0}]"

 

As you can see from the DbColumnMapping code, the DelimitedColumnName property uses the string.Format() method to wrap the actual column name in the delimiter characters specific to the platform before returning it to the client code.
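Here is a quick standalone check of how such a format string behaves. (The Postgres format string in the comment is my own aside; the article only shows the SQL Server one.)

```csharp
using System;

// SQL Server wraps identifiers in square brackets. Another platform would
// supply a different format string (Postgres, for example, uses "\"{0}\"").
string sqlServerFormat = "[{0}]";

// A column name containing a space MUST be delimited in generated SQL:
string delimited = string.Format(sqlServerFormat, "Last Name");
Console.WriteLine(delimited);  // → [Last Name]
```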

The two properties related to primary key status are set to sensible defaults: most columns are NOT primary keys, and most columns are NOT auto-incrementing.

Next, I wanted an easy way to retrieve the column name for a specific property, or the property name for a specific column. To this end, I created the DbColumnMappingLookup class. It is composed around two Dictionaries which hold references to the same set of DbColumnMapping objects, but key them differently: one is keyed by the property name associated with each DbColumnMapping, and the other by the column name. Either way, the retrieved DbColumnMapping instance contains all the information we might need about the association.

The DbColumnMappingLookup Class:
public class DbColumnMappingLookup
{
    Dictionary<string, DbColumnMapping> ByProperty;
    Dictionary<string, DbColumnMapping> ByColumn;
    string _delimiterFormatString;
  
    public DbColumnMappingLookup(string NameDelimiterFormatString)
    {
        _delimiterFormatString = NameDelimiterFormatString;
        this.ByProperty = new Dictionary<string, DbColumnMapping>
            (StringComparer.InvariantCultureIgnoreCase);
        this.ByColumn = new Dictionary<string, DbColumnMapping>
            (StringComparer.InvariantCultureIgnoreCase);
    }
  
    public int Count()
    {
        return this.ByProperty.Count();
    }
  
    public DbColumnMapping Add(string columnName, string propertyName)
    {
        var mapping = new DbColumnMapping(_delimiterFormatString);
        mapping.ColumnName = columnName;
        mapping.PropertyName = propertyName;
        // add the same instance to both dictionaries:
        this.ByColumn.Add(mapping.ColumnName, mapping);
        this.ByProperty.Add(mapping.PropertyName, mapping);
        return mapping;
    }
  
    public DbColumnMapping Add(DbColumnMapping mapping)
    {
        this.ByColumn.Add(mapping.ColumnName, mapping);
        this.ByProperty.Add(mapping.PropertyName, mapping);
        return mapping;
    }
  
    public DbColumnMapping FindByColumn(string columnName)
    {
        DbColumnMapping mapping;
        this.ByColumn.TryGetValue(columnName, out mapping);
        return mapping;
    }
  
    public DbColumnMapping FindByProperty(string propertyName)
    {
        DbColumnMapping mapping;
        this.ByProperty.TryGetValue(propertyName, out mapping);
        return mapping;
    }
  
    public bool ContainsPropertyName(string propertyName)
    {
        return this.ByProperty.ContainsKey(propertyName);
    }
  
    public bool ContainsColumnName(string columnName)
    {
        return this.ByColumn.ContainsKey(columnName);
    }
}

 

The Add() method adds each new DbColumnMapping instance to both dictionaries, using the appropriate property value as the key. There is also an overloaded Add() method by which a new mapping can be created implicitly by passing a column name and a property name as arguments.

Given the above, if I have a particular column name and need to find the matching property, I can simply pass the column name to the FindByColumn() method; the DbColumnMapping object is returned, and from it I can grab the associated property name (and any other column details I need).

Likewise, I can test whether a specific property or column is represented in the lookup by passing the appropriate string to the ContainsPropertyName() or ContainsColumnName() methods.
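The dual-dictionary trick at the heart of DbColumnMappingLookup is easy to demonstrate standalone, using plain Dictionary instances and an anonymous object as a stand-in for DbColumnMapping:

```csharp
using System;
using System.Collections.Generic;

// A single mapping record, referenced from two dictionaries under
// different keys (property name vs. column name):
var mapping = new { ColumnName = "first_name", PropertyName = "FirstName" };

// Case-insensitive keys, just like DbColumnMappingLookup above:
var byProperty = new Dictionary<string, object>(StringComparer.InvariantCultureIgnoreCase);
var byColumn = new Dictionary<string, object>(StringComparer.InvariantCultureIgnoreCase);

byProperty[mapping.PropertyName] = mapping;
byColumn[mapping.ColumnName] = mapping;

// Same instance either way, and lookups ignore case:
Console.WriteLine(ReferenceEquals(byProperty["firstname"], byColumn["FIRST_NAME"]));  // → True
```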

But wait, John - what if the same column or property name exists for more than one table or object?

Well, that's where the DBTableMapping class comes in.

Mapping Columns to Tables, and Tables to Domain Types

The final piece of the puzzle, the DBTableMapping class, gathers together all of the columns for each table, and performs a similar name-mapping function between tables and domain types.

The DBTableMapping Class:
public class DBTableMapping
{
    string _delimiterFormatString;
    public DBTableMapping(string delimiterFormatString)
    {
        _delimiterFormatString = delimiterFormatString;
        this.ColumnMappings = new DbColumnMappingLookup(_delimiterFormatString);
        this.PrimaryKeyMapping = new List<DbColumnMapping>();
    }
    public string DBTableName { get; set; }
    public string MappedTypeName { get; set; }
    public string DelimitedTableName
    {
        get { return string.Format(_delimiterFormatString, this.DBTableName); }
    }
    public List<DbColumnMapping> PrimaryKeyMapping { get; set; }
    public DbColumnMappingLookup ColumnMappings { get; set; }
}

 

Once again, we initialize DBTableMapping with the database platform delimiter format string. Also note that DBTableMapping contains a List<DbColumnMapping> specifically to hold references to the primary key(s) for the table in question. While not common, composite PKs do occur in the wild.

Cache Schema Information and Inject into the Store

Coming down the pike very quickly is a revamped architecture for Biggy, in which the in-memory list functionality is cleanly separated from the backing store implementation. At the same time, we may add a "DBCache" object which would be (optionally) injected into the store. The DBCache would make a single round trip to the database to retrieve schema info into the structures described above, which could then be passed by reference to store instances at initialization.

The store itself would be concerned with managing the serialization and deserialization of data between the database, the domain objects consumed by Biggy, and your application code. Sitting on top of the store would be an instance of IBiggy itself. The concrete implementation of an IBiggy would be concerned only with managing the in-memory data, and calling into the store to retrieve, write, or synchronize data.
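To make the shared-cache idea concrete, here is a tiny standalone sketch. FakeCache and FakeStore are invented stand-ins for the proposed DBCache and store classes; the point is only that a single cache instance, built once, is shared by reference across stores:

```csharp
using System;
using System.Collections.Generic;

// Stand-in for DBCache: built once, holding schema info from a
// single round trip to the database.
class FakeCache
{
    public List<string> DbTableNames { get; } =
        new List<string> { "Artists", "Albums" };
}

// Stand-in for a store: receives the cache by reference at
// initialization, and never re-queries the schema itself.
class FakeStore<T>
{
    public FakeCache Cache { get; }
    public FakeStore(FakeCache cache) { Cache = cache; }
}

class Artist { }
class Album { }

class Program
{
    static void Main()
    {
        // One schema round trip, shared by every store:
        var cache = new FakeCache();
        var artists = new FakeStore<Artist>(cache);
        var albums = new FakeStore<Album>(cache);
        Console.WriteLine(ReferenceEquals(artists.Cache, albums.Cache));  // → True
    }
}
```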

All that said, the DBCache would be the ideal place to cache all of our database schema info. With two quick pokes at Information_Schema at initialization, we can grab everything we need. Let's take a look at how this would work in a SQL Server implementation.

First, the cache. (I've omitted non-relevant code to focus on retrieving and loading our column-mapping data, and I've simplified things a bit.) The cache consists of a base class, in our case DBCache, and a subclass specific to the database platform:

The DBCache Base Class (Simplified):
public abstract class DBCache
{
    public abstract string DbDelimiterFormatString { get; }
    // Each platform subclass provides its own connection type:
    public abstract DbConnection OpenConnection();
    protected abstract void LoadDbColumnsList();
    protected abstract void LoadDbTableNames();
  
    public string ConnectionString { get; set; }
    public List<DbColumnMapping> DbColumnsList { get; set; }
    public List<string> DbTableNames { get; set; }
  
  
    public DBCache(string connectionStringName)
    {
        ConnectionString = ConfigurationManager
            .ConnectionStrings[connectionStringName].ConnectionString;
        this.LoadSchemaInfo();
    }
  
  
    public void LoadSchemaInfo()
    {
        this.LoadDbTableNames();
        this.LoadDbColumnsList();
    }
    // ... A bunch of other code here unrelated to mapping
}

 

In the above, you can see we have a List<DbColumnMapping> and a List<string> representing database columns and table names, respectively. The call to LoadSchemaInfo() pulls in the table names and the column info from the database using INFORMATION_SCHEMA. However, LoadSchemaInfo() in turn calls two abstract methods which are implemented in our subclass, in this case SQLServerCache:

The SQLServerCache Subclass of DBCache:
public class SQLServerCache : DBCache 
{
    public SQLServerCache(string connectionStringName) 
        : base(connectionStringName) { }
  
    public override string DbDelimiterFormatString 
    {
        get { return "[{0}]"; }
    }
  
    public override DbConnection OpenConnection() 
    {
        var conn = new SqlConnection(this.ConnectionString);
        conn.Open();
        return conn;
    }
  
    protected override void LoadDbColumnsList() 
    {
        this.DbColumnsList = new List<DbColumnMapping>();
        var sql = @"
            SELECT c.TABLE_NAME, c.COLUMN_NAME,
              CASE tc.CONSTRAINT_TYPE 
                WHEN 'PRIMARY KEY' THEN CAST(1 AS Bit) ELSE CAST(0 AS Bit) 
              END AS IsPrimaryKey,
              CASE COLUMNPROPERTY(OBJECT_ID(tc.TABLE_NAME), kcu.COLUMN_NAME, 'IsIdentity') 
                WHEN 1 THEN CAST(1 AS Bit) ELSE CAST(0 AS Bit) 
              END AS IsAuto
            FROM INFORMATION_SCHEMA.COLUMNS c
            LEFT OUTER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
              ON c.TABLE_SCHEMA = kcu.CONSTRAINT_SCHEMA 
              AND c.TABLE_NAME = kcu.TABLE_NAME 
              AND c.COLUMN_NAME = kcu.COLUMN_NAME
            LEFT OUTER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
              ON kcu.CONSTRAINT_SCHEMA = tc.CONSTRAINT_SCHEMA 
              AND kcu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME";
  
        using (var conn = this.OpenConnection()) 
        using (var cmd = conn.CreateCommand()) 
        {
            cmd.CommandText = sql;
            using (var dr = cmd.ExecuteReader()) 
            {
                while (dr.Read()) 
                {
                    var clm = dr["COLUMN_NAME"] as string;
                    var newColumnMapping = new DbColumnMapping(this.DbDelimiterFormatString) {
                        TableName = dr["TABLE_NAME"] as string,
                        ColumnName = clm,
                        PropertyName = clm,
                        IsPrimaryKey = (bool)dr["IsPrimaryKey"],
                        IsAutoIncementing = (bool)dr["IsAuto"]
                    };
                    this.DbColumnsList.Add(newColumnMapping);
                }
            }
        }
    }
  
    protected override void LoadDbTableNames() 
    {
        this.DbTableNames = new List<string>();
        var sql = "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbo'";
        using (var conn = this.OpenConnection()) 
        using (var cmd = conn.CreateCommand()) 
        {
            cmd.CommandText = sql;
            using (var dr = cmd.ExecuteReader()) 
            {
                while (dr.Read()) 
                {
                    this.DbTableNames.Add(dr.GetString(0));
                }
            }
        }
    }
}

 

In the subclass SQLServerCache, we find the implementations of LoadDbTableNames() and LoadDbColumnsList(), which fetch the table names and column info we need, respectively. That big, ugly blob of raw SQL retrieves all the columns in our database, and includes everything required to properly populate each DbColumnMapping object. Except, of course, the mapped property name. That's next. For now, we just fill it in with the column name, and overwrite it later as appropriate.

Output from Big Ugly SQL from Information_Schema:

[Screenshot: query results listing TABLE_NAME, COLUMN_NAME, IsPrimaryKey, and IsAuto for each column in the database]

Mapping Table and Column Data at the Table Level

Now, after all that ceremony, we get to the heart of the matter. We will add a method to the base class DBCache which accepts a type argument <T>, attempts to map that type to a database table, and then maps the table columns to the type's properties.

Ready for a monolithic wall of code? Good! There are some potential refactorings in here, but for the most part we have a lot of layered type introspection going on, and for now it made more sense to leave it all together. Here it is, I give you the getTableMappingFor<T>() method:

The getTableMappingFor<T> Method:
public virtual DBTableMapping getTableMappingFor<T>() where T : new()
{
    // This is what we want to return to the caller:
    var tableMapping = new DBTableMapping(this.DbDelimiterFormatString);
  
    // Always cache the results of reflection-based calls where possible:
    var item = new T();
    var itemType = item.GetType();
    var properties = itemType.GetProperties();
  
    // Strip out anything which is not a letter or a digit:
    string replaceString = "[^a-zA-Z0-9]";
    var rgx = new Regex(replaceString);
  
    // Get a down-cased version of the type name
    string flattenedItemTypeName = 
        rgx.Replace(itemType.Name.ToLower(), "");
  
    // Get a pluralized version, in case the table is pluralized:
    string plural = Inflector.Inflector
        .Pluralize(flattenedItemTypeName);
  
    // Does the fully-lower-case type name match any fully-lower-case table name?
    var dbTableName = this.DbTableNames
        .FirstOrDefault(t => 
            rgx.Replace(t.ToLower(), "") == flattenedItemTypeName);
    if (dbTableName == null)
    {
        // If not, does the plural match any table name?
        dbTableName = this.DbTableNames
            .FirstOrDefault(t => rgx.Replace(t.ToLower(), "") == plural);
    }
    if (dbTableName == null)
    {
        // Still no match - as a last resort, is the type decorated
        // with a DbTableAttribute?
        var tableNameAttribute = itemType.GetCustomAttributes(false)
            .FirstOrDefault(a => a.GetType() == 
                typeof(DbTableAttribute)) as DbTableAttribute;
        if (tableNameAttribute != null)
        {
            dbTableName = tableNameAttribute.Name;
        }
    }
  
    tableMapping.DBTableName = dbTableName;
    tableMapping.MappedTypeName = itemType.Name;
  
    // Find the column info for this table in the local DbColumnsList:
    var dbColumnInfo = from c in this.DbColumnsList 
                       where c.TableName == dbTableName select c;
    foreach (var property in properties)
    {
        // Down-case the property name
        string flattenedPropertyName = rgx.Replace(property.Name.ToLower(), "");
        // Does the down-cased property name match any down-cased column name?
        DbColumnMapping columnMapping = dbColumnInfo
            .FirstOrDefault(c => 
                rgx.Replace(c.ColumnName.ToLower(), "") == flattenedPropertyName);
        if (columnMapping != null)
        {
            // Use it as-is:
            columnMapping.PropertyName = property.Name;
            columnMapping.DataType = property.PropertyType;
        }
        else
        {
            // Look for a custom column name attribute:
            var attribute = property.GetCustomAttributes(false)
                .FirstOrDefault(a => a.GetType() == typeof(DbColumnAttribute));
            if (attribute != null)
            {
                // Use the column name found in the attribute:
                var mappedColumnAttribute = attribute as DbColumnAttribute;
                string matchColumnName = mappedColumnAttribute.Name;
                columnMapping = dbColumnInfo
                    .FirstOrDefault(c => c.ColumnName == matchColumnName);
                if (columnMapping != null)
                {
                    columnMapping.PropertyName = property.Name;
                    columnMapping.DataType = property.PropertyType;
                }
            }
        }
        if (columnMapping != null)
        {
            // Add the column mapping instance to the 
            // DBTableMapping.ColumnMappings list:
            tableMapping.ColumnMappings.Add(columnMapping);
            if (columnMapping.IsPrimaryKey)
            {
                // Add the DbColumnMapping instance to 
                // the list of primary key columns for the current table:
                tableMapping.PrimaryKeyMapping.Add(columnMapping);
            }
        }
    }
    return tableMapping;
}

 

The getTableMappingFor<T>() method above returns a complete mapping of the table represented by the type <T> to the caller (in this case, an instance of SQLServerStore). The store can then use the mapping to rapidly retrieve the proper database object mappings for various domain objects as needed: to build SQL on the fly, serialize and de-serialize objects, and ensure that SQL pushed into the database is properly delimited by default.
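As an illustration of what the store can do with a completed mapping, here is a toy version of delimited SQL generation. The table and column names are invented, and the mapping data is hard-coded rather than pulled from INFORMATION_SCHEMA:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hard-coded stand-in for a populated table mapping:
string delimiterFormat = "[{0}]";
string tableName = "Client Table";
var columnNames = new List<string> { "Client Id", "Last Name", "First Name" };

// Delimit everything by default, as the store would:
string delimitedTable = string.Format(delimiterFormat, tableName);
string columnList = string.Join(", ",
    columnNames.Select(c => string.Format(delimiterFormat, c)));

string sql = $"SELECT {columnList} FROM {delimitedTable}";
Console.WriteLine(sql);
// → SELECT [Client Id], [Last Name], [First Name] FROM [Client Table]
```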

Custom Attributes as a Last Resort

Notice in the above code that, while attempting to match a table name, if we have not found a match by any other means, we look for a custom attribute of type DbTableAttribute. Likewise, when trying to match column names, we eventually end up looking for a DbColumnAttribute.

In other words, when all else fails, we look to see if the user has decorated a class or property with a custom attribute which provides a name mapping. In reality, between the down-cased name comparison, the pluralized comparison, and the regex which strips out any odd characters, we have likely covered 95% of the name-mismatch cases.

But sometimes you may need to map a property to a column name which, for whatever reason, is never going to match. In this case, you can break out the EF-style custom attributes we added for just this purpose.

Custom Attributes for when Nothing Else Works:
public class DbColumnAttribute : Attribute
{
    public string Name { get; protected set; }
    public DbColumnAttribute(string name)
    {
        this.Name = name;
    }
}
  
public class DbTableAttribute : Attribute
{
    public string Name { get; protected set; }
    public DbTableAttribute(string name)
    {
        this.Name = name;
    }
}
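Using the attributes on a domain class might look like this. The legacy table and column names here are invented for illustration, and I've included local copies of the attribute classes so the sample compiles on its own:

```csharp
using System;
using System.Linq;

// Local copies of the attribute classes shown above:
public class DbColumnAttribute : Attribute
{
    public string Name { get; }
    public DbColumnAttribute(string name) { Name = name; }
}

public class DbTableAttribute : Attribute
{
    public string Name { get; }
    public DbTableAttribute(string name) { Name = name; }
}

// A model whose names will never loosely match the (invented) legacy schema:
[DbTable("tbl_CLIENTMASTER")]
public class Client
{
    [DbColumn("CM_LASTNM")]
    public string LastName { get; set; }
}

public class Program
{
    public static void Main()
    {
        // The mapper reads these as a last resort:
        var tableAttr = (DbTableAttribute)typeof(Client)
            .GetCustomAttributes(false).First(a => a is DbTableAttribute);
        var columnAttr = (DbColumnAttribute)typeof(Client).GetProperty("LastName")
            .GetCustomAttributes(false).First(a => a is DbColumnAttribute);
        Console.WriteLine(tableAttr.Name + "." + columnAttr.Name);
        // → tbl_CLIENTMASTER.CM_LASTNM
    }
}
```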

 

Putting it All Together

Well, gee whiz, John, that was a long, rambling tour. Why should I care?

Well, now Biggy can provide automatic table and column mapping out of the box, 95% of the time. In those edge cases where Biggy can't automagically figure out how to map database objects to domain objects, you can always add some custom attributes to make it work.

For example, something which happens frequently under the hood is that Biggy needs to de-serialize data incoming from the backing store into a proper domain object. If we take a look at some code from the BiggyRelationalStore class, we can see that during initialization, the store reaches into the injected cache to retrieve a DBTableMapping instance.

Code Excerpted from the Base Class BiggyRelationalStore:
// Constructor for Base Class:
public BiggyRelationalStore(DbCache dbCache) 
{
    this.Cache = dbCache;
    this.tableMapping = this.getTableMappingForT();
    // Is there an auto PK? If so, set the member variable:
    if(this.tableMapping.PrimaryKeyMapping.Count == 1) 
    {
        if(this.tableMapping.PrimaryKeyMapping[0].IsAutoIncementing) 
        {
            this.PrimaryKeyMapping = 
                this.tableMapping.PrimaryKeyMapping[0];
        }
    }
}
  
  
public virtual DBTableMapping getTableMappingForT() 
{
    return this.Cache.getTableMappingFor<T>();
}
  
// A bunch of other code...
  
internal T MapReaderToObject<T>(IDataReader reader) where T : new() 
{
    var item = new T();
    var props = item.GetType().GetProperties();
    foreach (var property in props) 
    {
        if (this.PropertyColumnMappings.ContainsPropertyName(property.Name)) 
        {
            string mappedColumn = 
                this.PropertyColumnMappings
                    .FindByProperty(property.Name).ColumnName;
            int ordinal = reader.GetOrdinal(mappedColumn);
            var val = reader.GetValue(ordinal);
            if (!(val is DBNull)) 
            {
                property.SetValue(item, val);
            }
        }
    }
    return item;
}

 

After initialization, the DBTableMapping instance is available locally, so that methods such as MapReaderToObject can map object properties to database columns, retrieve the data from a DataReader, and hydrate a domain object instance.

Explore the Code

Much of the code you see above is in the final stages of development, but has not yet been pushed to the master branch of the Github repo. I expect it will be soon. In the meantime, I strongly encourage you to pull down the current code and see what we've got happening. The project is young, and it will be evolving rapidly in the weeks to come. There is a lot to see up there, and I hope you find something useful, and also shoot us some feedback. Especially bug reports!

Additional Resources and Items of Interest

 

Posted on March 31 2014 09:43 PM by jatten     
