Monday, April 14, 2008

In my previous post, WCF And Large Messages, I mentioned there was a better way to send large data. As I have been getting a lot of traffic on this topic, here is the improved methodology:

One of the really sweet features of WCF is that it allows messages to be streamed between client and server. By default, messages are buffered and only sent once they are completely built.

While buffering works great for small messages, streaming really pays off once you start sending large amounts of data (in my case a 50-70 MB file). For me, sending the data as one large message took an average of 23.3 seconds using the standard buffer-and-burst method described here. Streaming the same data took an average of only 4 seconds.

Streaming is only supported under the basicHttpBinding, netTcpBinding, and netNamedPipeBinding bindings. If you are hosting your service in IIS6, your only option is basicHttpBinding (or creating your own binding, but that is outside the scope of this post). If you are hosting in IIS7, you will be able to use the TCP and named pipe bindings as well. (IIS7 can also host the MSMQ bindings, but those are not on the streaming list above.)

Enabling streaming was surprisingly simple. All I had to do was create a new binding configuration:

  <basicHttpBinding>
        <binding name="StreamingFileTransferServicesBinding" 
                 transferMode="StreamedRequest"  
                 maxBufferSize="65536"
                 maxReceivedMessageSize="204003200"  />
  </basicHttpBinding>

And then set my service to use that binding configuration:

      <service behaviorConfiguration="MyBehavior" name="MyStreamingService">
        <endpoint address=""
                  binding="basicHttpBinding"
                  bindingConfiguration="StreamingFileTransferServicesBinding"
                  contract="IMyStreamingService" />
        <endpoint address="mex"
                  binding="mexHttpBinding"
                  contract="IMetadataExchange" />
      </service>

To dissect this a bit: I have set a maxBufferSize and a maxReceivedMessageSize, which control how much data is buffered and how big a received message can be. To be honest, I have not played with these settings very much yet, so you will probably want to tweak them for your own situation.

Also in the binding configuration, there are several different transfer modes we can set up:

Streamed - Both request and response messages are streamed
StreamedRequest - Only messages sent from the client to the server are streamed
StreamedResponse - Only messages returned from the server to the client are streamed
Buffered - The default; all data is buffered and sent in one burst

A BIG thing to note is that when streaming, the only allowed data types are Message, Stream, or an IXmlSerializable implementation, and this applies to ALL methods in your service! If we use "Streamed" as our transfer mode, then BOTH the input parameter and the return value must be one of these types. If you only need to stream in one direction, for example sending data and getting back a small data object or primitive, use StreamedRequest or StreamedResponse instead.
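One thing worth sketching here is the client side. The transfer mode is a local binding setting and is not published in the service metadata, so a generated client config usually comes back as Buffered and has to be changed by hand. The fragment below is a minimal sketch, not my actual client config: the binding name, endpoint name, and address are placeholders I made up for illustration.

```xml
<system.serviceModel>
  <bindings>
    <basicHttpBinding>
      <!-- Hypothetical name. transferMode must be set by hand on the client
           because it is not carried in the service's metadata. -->
      <binding name="StreamingFileTransferClientBinding"
               transferMode="StreamedRequest" />
    </basicHttpBinding>
  </bindings>
  <client>
    <!-- Placeholder address and endpoint name; point this at wherever
         the service is actually hosted. -->
    <endpoint name="StreamingEndpoint"
              address="http://localhost/MyStreamingService.svc"
              binding="basicHttpBinding"
              bindingConfiguration="StreamingFileTransferClientBinding"
              contract="IMyStreamingService" />
  </client>
</system.serviceModel>
```

With StreamedRequest on both ends, the upload is streamed while the small return object still comes back as a normal buffered message.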

Onwards to code!

Function ProcessFile(ByVal data As Stream) As DataContracts.ValidatedAuthority

As you can see here my interface is pretty simple. It takes in a stream of data and returns a simple object that shows how the file processing went.

As I mentioned before, because we are using streaming, all methods on the service must take only a Stream, Message, or IXmlSerializable as parameters. If you want methods that are not bound by this restriction, create a separate service that does not use streaming.

Now, if you are hosting in IIS, you will still need to let the HTTP runtime know that you are sending large data with: <httpRuntime maxRequestLength="73400" executionTimeout="100" /> (or whatever settings you think are appropriate).
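For placement, the httpRuntime element goes under system.web in the service's web.config. Here is a minimal sketch. The system.webServer part is an assumption on my side for IIS7 hosting and was not part of my original setup: IIS7's request filtering has its own size limit, so it may need raising too.

```xml
<configuration>
  <system.web>
    <!-- maxRequestLength is in kilobytes (73400 KB is roughly 71.7 MB);
         executionTimeout is in seconds. -->
    <httpRuntime maxRequestLength="73400" executionTimeout="100" />
  </system.web>

  <!-- Assumption, IIS7 only: request filtering enforces a separate limit,
       expressed in bytes. 75161600 bytes = 73400 KB, matching the value above. -->
  <system.webServer>
    <security>
      <requestFiltering>
        <requestLimits maxAllowedContentLength="75161600" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
```

Whichever limit is smallest wins, so keep these values in line with maxReceivedMessageSize on the binding.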

A little housekeeping note: you will need to dispose of the stream on both the client and the server. There are actually two streams in two different app domains, so the client and the server each need to clean up their own.

Also for completeness here is my entire service model config section:

<system.serviceModel>
    <services>
      <!--Streaming Service-->
      <service behaviorConfiguration="MyBehavior" name="MyStreamingService">
        <endpoint address=""
                  binding="basicHttpBinding"
                  bindingConfiguration="StreamingFileTransferServicesBinding"
                  contract="IMyStreamingService" />
        <endpoint address="mex"
                  binding="mexHttpBinding"
                  contract="IMetadataExchange" />
      </service>
    </services>
    <behaviors>
      <serviceBehaviors>
        <behavior name="MyBehavior">
          <serviceMetadata httpGetEnabled="true" />
          <serviceDebug includeExceptionDetailInFaults="true" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
    <bindings>
      <basicHttpBinding>
        <binding name="StreamingFileTransferServicesBinding"
                 transferMode="StreamedRequest"
                 maxBufferSize="65536"
                 maxReceivedMessageSize="204003200" />
      </basicHttpBinding>
    </bindings>
</system.serviceModel>
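
For anyone hosting under IIS7 who would rather use TCP, the same idea should carry over to netTcpBinding, which has the same transferMode and size attributes. This is an untested sketch rather than something I have running: the binding name is made up, and the site still needs net.tcp enabled in IIS7, which is not shown here.

```xml
<system.serviceModel>
  <services>
    <service behaviorConfiguration="MyBehavior" name="MyStreamingService">
      <!-- Same contract as before, just pointed at a TCP binding configuration -->
      <endpoint address=""
                binding="netTcpBinding"
                bindingConfiguration="StreamingTcpBinding"
                contract="IMyStreamingService" />
    </service>
  </services>
  <bindings>
    <netTcpBinding>
      <!-- Hypothetical name; sizes copied from the HTTP binding above -->
      <binding name="StreamingTcpBinding"
               transferMode="StreamedRequest"
               maxBufferSize="65536"
               maxReceivedMessageSize="204003200" />
    </netTcpBinding>
  </bindings>
</system.serviceModel>
```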
 

I recently had to load a lot of comma-separated data from a file, and I never knew how easy it was to load CSV data into a table. Here is the T-SQL:

BULK
INSERT Address
FROM 'c:\address.csv'
WITH
    (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
    )

It is so handy to have this feature. Just point the FROM clause at the delimited file on disk and set your delimiters. In my case I am using a comma as the field separator, but you could use any character (use \t for tab).

One other common situation is that the first row in your CSV file holds the header information (as is the case in my example). To skip the header record and start reading data from the second row, use FIRSTROW = 2 (as shown).