
 

Brief Information about the June '17 CSIG Meeting

ACGNJ.exe - String Handling for Processing Data - Windows C# with .Net

by Bruce Arnold

 

Text Sample

Welcome to the CSIG, a Special Interest Group of the ACGNJ. This is an exciting time for C language programming, since Microsoft now has two different language compilers: C++ and C-Sharp (C#). Both are capable of creating Windows (tm) programs, and both are included in the latest free Visual Studio from Microsoft.  During the meeting there will be time for Random Access Questions as well as other discussions.  This lecture will show an example of converting last month's "console text" program into a true "Windows GUI" program.

Here's a brief synopsis of the coming meeting:

ACGNJ.exe - String Handling for Processing Data - Windows Version, C# with .Net

 

One of the great powers of C is its speed and ability to handle strings in data applications.  For tonight's meeting we will discuss a text-based application for filtering a data file to provide a better output.  Back in March/April 2014, I presented two programs called Esenders and WEsenders (text console based and then Windows based).  These applications read all of the emails on a typical IMAP server and extracted the mailing addresses of all of the senders.  The raw data file was typically about 4000 lines for every 1000 emails in your INBOX.  The programs stopped there without processing the data any further.

The program for this meeting is called W FILTER WESEND.  It takes the original output file and continues:

1) It reads all of the lines and outputs only those lines starting with the string FROM:
2) It then sorts the lines.
3) It removes any duplicates.
4) Finally, it outputs a list of the unique email addresses to a LISTBOX (see above) so that it may be pasted to a file if desired.

Even though this month's program is a true Windows program, it still processes a MILLION LINE data file in less than 2 seconds!  As shown in the list below, it calls upon class methods from some of the most sophisticated Microsoft Class Libraries.
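The four filtering steps described above can be sketched compactly. This is only a minimal illustration, not the program presented at the meeting: the names FilterSketch and UniqueSenders and the sample lines are hypothetical, and the sketch assumes each sender line begins with the literal prefix "From: ". It uses a SortedSet<string>, which keeps its elements in order and ignores repeated values, so sorting and duplicate removal happen in one pass.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the class name, method name and sample data are illustrative only.
static class FilterSketch
{
    const string Prefix = "From: "; // assumed line prefix

    // Keep lines starting with the prefix, strip it, then sort and de-duplicate.
    public static List<string> UniqueSenders(IEnumerable<string> lines)
    {
        // SortedSet keeps its elements ordered and silently ignores repeated values.
        var unique = new SortedSet<string>(StringComparer.Ordinal);
        foreach (string line in lines)
        {
            if (line.StartsWith(Prefix, StringComparison.Ordinal))
                unique.Add(line.Substring(Prefix.Length));
        }
        return new List<string>(unique);
    }

    static void Main()
    {
        string[] sample =
        {
            "From: bob@example.com",
            "Subject: hello",
            "From: alice@example.com",
            "From: bob@example.com"
        };

        // Prints alice@example.com, then bob@example.com.
        foreach (string sender in UniqueSenders(sample))
            Console.WriteLine(sender);
    }
}
```

Using an ordinal comparer makes the ordering independent of the machine's culture settings; a culture-aware comparer could be substituted if a locale-specific sort is wanted.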

Last month's program used Microsoft DOT NET C# code (CLI) to create a Windows Console Application.  Source code is available as well as an executable.

This month's program uses Microsoft DOT NET C# code (CLI) to create a Windows Form Application using most of the same code.  Source code is available as well as an executable.
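One way a console program and a Windows Form program can share most of their code is to keep the processing independent of where the results are displayed. The sketch below is an assumption about how that could be arranged, not the structure of the actual programs: SenderReport, Emit and listBox1 are hypothetical names. The caller supplies the output action, so the same method can write to the console or add items to a list box.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: SenderReport and Emit are illustrative names.
static class SenderReport
{
    // Send each item to whatever output the caller supplies and return the item count.
    public static int Emit(IEnumerable<string> items, Action<string> output)
    {
        int count = 0;
        foreach (string item in items)
        {
            output(item);
            ++count;
        }
        return count;
    }

    static void Main()
    {
        var senders = new List<string> { "alice@example.com", "bob@example.com" };

        // Console front end: write each line to standard output.
        int written = Emit(senders, Console.WriteLine);
        Console.WriteLine("Unique lines written = {0}", written);

        // A Windows Forms front end could instead pass a delegate that adds to a
        // ListBox, for example: Emit(senders, s => listBox1.Items.Add(s));
        // (listBox1 is an assumed control name).
    }
}
```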

The program uses some of the most sophisticated Dot Net Library functions including the following:

String and File handling functions.

Console output functions.
System.Collections.Generic
System.Linq
System.Text
System.IO
System.Text.RegularExpressions

https://msdn.microsoft.com/en-us/library/mt472912(v=vs.110).aspx


There are a number of ways to refer to Microsoft's latest compilers and code. Here's what Wikipedia says: The Common Language Infrastructure (CLI) is an open specification developed by Microsoft that describes the executable code and runtime environment that form the core of the Microsoft .NET Framework. The specification defines an environment that allows multiple high-level languages to be used on different computer platforms without being rewritten for specific architectures.

Microsoft .Net Framework 4.5
C++ / C#
Visual Studio Community 2015
CLI
Common Language Infrastructure
Managed

Sample Code

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;

namespace FilterWesend
{
    class Program
    {
        static void Main(string[] args)
        {
            string path = "emailresponse.txt";
            string rawdata = ReadRawData(path);
            List<string> processedData = Process(rawdata);
            processedData.Sort();
            var noDupsList = new HashSet<string>(processedData).ToList();
            noDupsList.ForEach(Print);

            string dashes = "--------------------------------------------------";
            Console.WriteLine(string.Format("\n{0}\n", dashes));
            int count = rawdata.Count(c => (c == '\n')) + 1;
            Console.WriteLine(string.Format("Lines read = {0}\n", count));
            count = processedData.Count;
            Console.WriteLine(string.Format("Lines starting with FROM: = {0}\n", count));
            count = noDupsList.Count;
            Console.WriteLine(string.Format("Unique lines written = {0}\n", count));
            Console.Read(); // wait for user.
        }

        //
        // Process by saving only lines starting with "From: " (the match is case-sensitive).
        //
        private static List<string> Process(string rawdata)
        {
            // Accept both Windows (\r\n) and Unix (\n) line endings.
            string[] linebuff = Regex.Split(rawdata, "\r?\n");
            List<string> result = new List<string>();
            string tmp;
            for (int idx = 0; idx < linebuff.Length; ++idx)
            {
                tmp = linebuff[idx];
                if (tmp.StartsWith("From: "))
                    result.Add(tmp.Substring(6)); // drop the 6-character "From: " prefix
            }
            return result;
        }

        // The original listing does not show ReadRawData or Print. The two helpers
        // below are assumed minimal versions, added only so the sample compiles.
        private static string ReadRawData(string path)
        {
            return File.ReadAllText(path);
        }

        private static void Print(string line)
        {
            Console.WriteLine(line);
        }
    }
}

SOURCE CODE

Source Code Files

For help, email me at b a r n o l d @ i e e e . o r g