Saturday, January 12, 2013

What is Git and Github?

What is Git?

By definition, Git is a distributed revision control [...] with an emphasis on speed [Wikipedia]. Let's break down those terms.

  • Distributed: The data is spread out in identical copies in multiple places, akin to torrent technology (peer-to-peer).
  • Revision control: Is the same as "version control" or "source control". It is the method by which you control, save and load different versions of your code and its changes over time.

Git implements the concept of branching. It's like a tree-branch, growing out from a stem.



A branch is initially a full copy of the original project (master). The branch can be updated and saved without affecting the master. When the branch is ready to be included into the original project it is then merged with the master - putting the code of the branch into the master.

A Git repository is usually hosted online by a service such as Github, but it can also live only locally on your computer. If you use an online repository, the online project and your local project will be exactly the same once you have pushed and pulled all changes.



Two other essential Git concepts are "pulling" and "pushing". Pulling gets the source code from a repository. Pushing puts changes of your code into the repository.

It is important to understand that there is no real difference between an online, hosted master and offline master in your local filesystem. Almost everything you can do with an online master you can do with an offline one. Do not confuse Git with Github. Git is the system used to manage your data. Github just helps you with hosting this data online.

Why would I use Git?

Git will help you store and manage your data. Indeed, that is the very point of version control.

If you work solo and on one computer, there is no real difference between using Git locally or online. But putting your project online gives you increased flexibility: you can share the project with others or download it at a new location, and the new copy will be exactly the same as the one on your computer. It is distributed, remember?

If several people work on the same project, Git provides easy online access to your data (if you use a host such as Github). Using branches, you can implement a feature without touching the original code and without disrupting your team members by submitting incomplete code. Just create a branch instead.

It's also certainly worth linking your Github account in your resumé if you have some nice code hosted there. Github is a well respected site, and many people know what it is and how to use it. And those that do not will surely be impressed by all your amazing computer code mumbo-jumbo.

Installing Git

The easiest way to install Git is to head to http://git-scm.com/ and download an easy client-bundle for your operating system.
You can also download some specialized GUIs with added functionality http://git-scm.com/download/gui/win

But we will use the first way, which gives you access to the Git Bash console.




Now, if you are on Windows, the commands accepted by the Git Bash console will be different from what you are used to. It only takes Unix commands, which you can read more about here.

Configuring Git

So fire up your fresh Git Bash and let's get to it.

Some initial necessary configurations:
# the name shown as you commit changes
git config --global user.name "firstname surname"
# you don't have to set a valid email (asd@asd.com is just fine) but it must be set. If you use Github, your email will function as your identification
git config --global user.email "email@mail.com" 
Not needed, but very useful.
# pushes all branches relevant to the project
git config --global push.default "matching"
# make newly created tracking branches rebase instead of merge on git pull
git config --global branch.autosetuprebase always 
Navigate to your project and create the Git files necessary to manage it (the path here is for Windows; Git Bash writes C:\MyProject as /c/MyProject)
# navigate
cd /c/MyProject
# create git files
git init
You can now index the files - telling Git what files you want to manage.
# this will add all files in the current folder and its subfolders to Git. The . stands for the current folder
git add .
If you are happy with your code, you can commit your files, putting them under local Git management
git commit -m "My first commit"
Now you are done with local settings, and you can stop here unless you want to put your code online.

Dealing with Github

Disclaimer: The way we have approached Git so far is project-first. We assume you have some existing files you want to upload. There are other ways to do it, but we won't do that now.

Head over to Github and create your account.
Then press this icon beside your user to create a repository:


Setup a repository for your project:


You will see some text. Copy this (specific to you). Be sure to select HTTP - it's a much simpler but less secure version, but just fine for our purposes.


Now you can add a shorthand for your online Git repository, so you don't have to type the URL every time (this might as well be a path on your local filesystem). Press Insert to paste into the console.
git remote add MyShortHand https://github.com/BlackOdd/MyProject.git
Now push the changes to your repository. The first push has to name the branch, since the new repository has no branches to match yet.
git push MyShortHand master
Enter the username and password for your Github account (the email you configured in the beginning should also be present among your account emails) - and you're done!

Other useful commands

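If anyone makes changes, you can pull them. This will update your files to be the same as the master.
git pull MyShortHand master
If you have changed files that are already indexed, the -a flag stages all of those changes as part of the commit, so you do not have to run git add again. New files still need git add.
git commit -a -m "I've done some changes"
If you want to create a copy of the master in another folder, clone it. The shorthand only exists inside your existing project, so use the full URL (or a local path) here.
git clone https://github.com/BlackOdd/MyProject.git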

Alternatives to Github

The main drawback with Github is that if you want free hosting, your project must be public for everyone to see and download.

I personally use Bitbucket https://bitbucket.org/ as it has no such limitations, giving you the choice to make your project public or private at no cost.

For more specific needs (user amount, project size, project amount etc) you can check out these:

http://repositoryhosting.com/
https://codeplane.com/
http://beanstalkapp.com/

Useful resources

Git tutorial by Lars Vogel: http://www.vogella.com/articles/Git/article.html
Official online and free training for Git: http://training.github.com/web/free-classes/
An exhaustive and free book for Git: http://git-scm.com/book

Friday, December 14, 2012

[C#] Read and write process memory in Windows

Disclaimer: I take no moral standpoint on, or legal responsibility for, what you do with this information. Just be aware of the consequences of your actions before you commit to them. (Tip: read license agreements)

Here is the code for this post:
http://www.mediafire.com/?43rm36a3wj219lz
http://pastie.org/5528082

Today we will manipulate a simple application, Notepad, through memory reading, writing and C# code.

The Process class in the System.Diagnostics namespace holds the GetProcessesByName method, which returns an array of all processes with that name.
Process[] p = Process.GetProcessesByName("notepad");
We must now define a method which interacts with the Windows native libraries to read data from a process. Using DllImport from the System.Runtime.InteropServices namespace, we can call OpenProcess in Kernel32, which gives us the handle we need later to both read and write the memory allocated to the process.
[DllImport("kernel32.dll")]
public static extern int OpenProcess(uint dwDesiredAccess, bool bInheritHandle, int dwProcessId);
We must set our desired level of Access Rights for our handle in the dwDesiredAccess parameter. This is not entirely simple. In essence, we create a flag depending on what we want to do, combining all the parameters into one. The following will give us full access.
uint DELETE = 0x00010000;
uint READ_CONTROL = 0x00020000;
uint WRITE_DAC = 0x00040000;
uint WRITE_OWNER = 0x00080000;
uint SYNCHRONIZE = 0x00100000;
uint END = 0xFFF; //the Windows XP / Windows Server 2003 value; Vista and later define 0xFFFF, but 0xFFF is enough for reading and writing memory
uint PROCESS_ALL_ACCESS = (DELETE | READ_CONTROL | WRITE_DAC | WRITE_OWNER | SYNCHRONIZE | END);
Let's assume you only have one Notepad process for simplicity. Since we have the Notepad process, we can get its ID, the dwProcessId. You can now get the handle like this.
int processHandle = OpenProcess(PROCESS_ALL_ACCESS, false, p[0].Id); 
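The handle is required to call the two methods we use for memory reading and writing within Kernel32. The signatures below use int for handles and addresses, which assumes the program runs as a 32-bit process; for 64-bit, IntPtr would be needed instead. Define them as the following: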
[DllImport("kernel32.dll")]
public static extern bool ReadProcessMemory(int hProcess, int lpBaseAddress, byte[] buffer, int size, int lpNumberOfBytesRead);

[DllImport("kernel32.dll")]
public static extern bool WriteProcessMemory(int hProcess, int lpBaseAddress, byte[] buffer, int size, int lpNumberOfBytesWritten);
To simplify things:
public byte[] ReadMemory(int adress, int processSize, int processHandle) {
    byte[] buffer = new byte[processSize];
    ReadProcessMemory(processHandle, adress, buffer, processSize, 0);
    return buffer;
}

public void WriteMemory(int adress, byte[] processBytes, int processHandle) {
    WriteProcessMemory(processHandle, adress, processBytes, processBytes.Length, 0);
}
Let's try reading first. We have the processHandle parameter, but not the address to read from or the size of the memory to read at that address. You can get these by a number of simple or advanced methods depending on what process you want to look at. The simplest way is to use Cheat Engine.

Open notepad and find the memory location with Cheat Engine (be sure to set it to Unicode).

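On my computer the string "Hello!" has been located at (0x)0026D3F0. But how big is it in memory? This is not simple in most cases. For this case, we can serialize the string "Hello!" and measure the result. Note that the serialized form includes BinaryFormatter header bytes, so it overestimates the 12 bytes that the six UTF-16 characters actually occupy; treat it as a rough upper bound rather than an exact size: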
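// requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
public int GetObjectSize(object testObject) {
    BinaryFormatter bf = new BinaryFormatter();
    using (MemoryStream ms = new MemoryStream()) {
        // serialize the object into memory and count the resulting bytes
        bf.Serialize(ms, testObject);
        return ms.ToArray().Length;
    }
}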

int processSize = GetObjectSize("Hello!");
Now we can finally read from the Notepad process!
Console.WriteLine((Encoding.Unicode.GetString(ReadMemory(0x0026D3F0, processSize, processHandle))));

Writing is easy too.
WriteMemory(0x0026D3F0, Encoding.Unicode.GetBytes("Wow!" + "\0"), processHandle);

We get a trailing "!" because it still lingers in memory. We did not overwrite the entire allocated memory for "Hello!" with our "Wow!" string.

Wednesday, December 12, 2012

[C#] Polymorphism: Exploring upcast references retaining data from child upon downcast

class Parent {}

class Child1: Parent {
    public int childInt;
}

class Child2: Parent {}

class Child3: Parent {
    public int childInt;
}

static void Main(string[] args) {
    var p = new Parent();
    var c1 = new Child1();

    c1.childInt = 5;
    p = c1; //Upcast
    c1 = (Child1) p; //Downcast
    //p.childInt not possible
    Console.WriteLine(c1.childInt); //5       

    var c2 = new Child2();
    //c2 = (Child2)p; this is legal but fails at runtime

    var c3 = new Child3();
    //c3 = (Child3)p; this is legal but fails at runtime

    c3 = p as Child3;
    if(c3 == null) //true
       Console.WriteLine("null"); 
}
It would seem logical if the Parent instance lost the data from the Child upon conversion, but there is actually no conversion going on. It's only a reference.
Since we deal with a reference, p simply points to c1 - but p is not exposing the childInt field.

Now, c1 = p requires an explicit cast since we may have other child classes, say Child2, and Child2 might not have the fields we require. The downcast could potentially fail at runtime (whereas p = c1 will never fail, because upcasts cannot).

And, as we see from the output noted in the code comment, c1 has retained its value. It makes sense when you think about it, since it's all references, but it may not seem obvious at first.

Interesting to note is that if we cast p to an identical class to Child1 with the same fields, Child3, it also fails.

If you want a downcast that cannot throw, you can use the as operator. If the conversion fails, it yields null instead of throwing an InvalidCastException. It fails here because p refers to a Child1 object, which is not a Child3.

Thursday, December 6, 2012

[C#] Generics: Testing what default(T) returns

default(T) seems to accept all types which originate from object.
It's said that it returns null for all reference types, and 0 for all value types. Let's see about that.

Reference Types
Value Types

Some code to get you started (Visual Studio): http://www.mediafire.com/?z908gj1imzibimh

Let's make about the simplest class possible to get the result of default(T) to a Console.WriteLine()
public class Test<T>
{        
    private T defaultData;

    public Test()
    {
        defaultData = default(T);
    }  
     
    public String GetDefaultData()
    {
        return ((bool)object.Equals(defaultData, null) ? "null" : defaultData.ToString());
    }
}
The GetDefaultData method exists because defaultData will evaluate to null in some cases, and we want to print a visible "null" to the Console rather than an empty string.
As we cannot call ToString() on null because it does not refer to a real object, we have to use object.Equals, which is a static method of the object class. It compares two objects to see if they are equal.
(Technically null is not an object but a reference that points to no object, so no methods can be called upon it.)

If you are confused about the ? and : in the return statement, it's called the "if else shorthand" or "conditional operator". You read it as the following:

condition ? condition_is_true : condition_is_false

Thus we return the string "null" if the defaultData is equal to the object null, so we can print it to the console. Otherwise, we can safely use ToString since the result is not null.

We can then test our class.
struct TestStruct {
    int x;
};

enum TestEnum {
    x = 1, y = 2
};

delegate void TestDelegate();

interface TestInterface {}

static void Main(string[] args) {

    var structT = new Test < TestStruct > ();
    var enumT = new Test < TestEnum > ();
    var longT = new Test < long > ();
    var intT = new Test < int > ();
    var boolT = new Test < bool > ();
    var stringT = new Test < String > ();
    var classT = new Test < Class1 > ();
    var dynamicT = new Test < dynamic > ();
    var objectT = new Test < object > ();
    var delegateT = new Test < TestDelegate > ();
    var interfaceT = new Test < TestInterface > ();

    Console.WriteLine("default data for the following types");
    Console.WriteLine();
    Console.WriteLine("value types");
    Console.WriteLine();
    Console.WriteLine("struct: " + structT.GetDefaultData());
    Console.WriteLine("enum: " + enumT.GetDefaultData());
    Console.WriteLine("long: " + longT.GetDefaultData());
    Console.WriteLine("int: " + intT.GetDefaultData());
    Console.WriteLine("bool: " + boolT.GetDefaultData());
    Console.WriteLine();
    Console.WriteLine("reference types");
    Console.WriteLine();
    Console.WriteLine("string: " + stringT.GetDefaultData());
    Console.WriteLine("class: " + classT.GetDefaultData());
    Console.WriteLine("dynamic: " + dynamicT.GetDefaultData());
    Console.WriteLine("object: " + objectT.GetDefaultData());
    Console.WriteLine("delegate: " + delegateT.GetDefaultData());
    Console.WriteLine("interface: " + interfaceT.GetDefaultData());

    Console.ReadLine();
}

I'm not testing all the variations of int, float and decimal value types, because I'm fairly certain that they all return 0.
And the results are:

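We see now that bool, which is a value type, returns False instead of 0 (makes sense...). And struct returns something weird: its type name, because a struct that does not override ToString() prints its fully qualified type name (Main() lies in Program.cs in the ConsoleApplication1 namespace.)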

If you are dissatisfied with the results, I have bad news. It is impossible to override default(T).

Wednesday, December 5, 2012

Learning Regular Expressions / Regex [Part 1] - Character Classes, Elements and Quantifiers


Please note: This syntax for Regex is for JavaScript, and the syntax for other languages may differ slightly!

Regular Expressions, or Regex for short, can be used to extract lists of simple or complex combinations of characters in a larger text.

From the text "otnhu,ch230 onte 2389]" we can, for example, get all digits, all words or the number of times a single character is repeated (with some wrapper functionality in, say, C# or Java SE).

A very good tool for testing your Regex expressions is regexpal. Just paste your expression and test-data in the two text-boxes.

Take a look at regexpal and look to the right, at the Quick Reference.

It's a lot of info, so let us start simple.

Character classes


\d matches any single digit, that is, one character from 0 to 9.
The result of '\d' can thus be substituted with '2' or '4' or '7' etc.
\d is called a character class.

Regex: \d
Matches: hello 2 to34

Regex: \d\d
Matches: hello 2 to34

'\d' will give you a list containing the three items '2', '3' and '4'.

'\d\d' will give you a list containing one item, '34'.
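
As a minimal sketch, the two expressions above can be tried in JavaScript with String.prototype.match and the g (global) flag, which returns every match as an array. The helper name findAll is only an illustration and not part of any library.

```javascript
// Hypothetical helper: return every match of a regex in a text as an array.
function findAll(regex, text) {
  // match() with a global regex returns null when nothing matches
  return text.match(regex) || [];
}

const sample = "hello 2 to34";

// \d matches one digit at a time
const singleDigits = findAll(/\d/g, sample);
console.log(singleDigits); // [ '2', '3', '4' ]

// \d\d needs two digits in a row, so the lone '2' is skipped
const digitPairs = findAll(/\d\d/g, sample);
console.log(digitPairs); // [ '34' ]
```

Without the g flag, match would stop at the first hit, which is why the flag is used throughout.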

Elements

All single characters, or substitutes such as \d, are elements. 
'2', 'a', 'B', '\d' are elements.
'22', 'AQ', 'TTT', '\d\d' are not elements.
Elements can also be constructed from many characters, using '(' and ')', for example (AB) is an element.

Quantifiers



Elements can have quantifiers, written after the element, which modify it.

+


The '+' quantifier modifies the preceding element to match one or more of itself.
So '+' after '\d' makes the '\d' repeat one or more times, until it reaches a non-digit.

Regex: \d+
Matches: my cute 235opossum is called 7777

As you see, the \d begins at '2' and the + quantifier makes \d repeat until '5'. Then it traverses the rest of the characters until '7' and ends at the last '7'.
Thus we get a list of two items containing '235' and '7777'.
So, as \d reaches 235 it becomes repeated three times '\d\d\d'. And upon 7777 it becomes repeated four times '\d\d\d\d'.

You can test this manually:

Regex: \d\d
Matches: my cute 235opossum is called 7777

Regex: \d\d\d
Matches: my cute 235opossum is called 7777
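
The same comparison can be run in JavaScript. This is a small sketch using the example sentence from above; the variable names are arbitrary.

```javascript
const text = "my cute 235opossum is called 7777";

// + repeats \d for as long as digits keep coming
const runs = text.match(/\d+/g);
console.log(runs); // [ '235', '7777' ]

// Fixed-length versions: each match is exactly two digits...
const pairs = text.match(/\d\d/g);
console.log(pairs); // [ '23', '77', '77' ]

// ...or exactly three digits
const triples = text.match(/\d\d\d/g);
console.log(triples); // [ '235', '777' ]
```

Note how the fixed-length versions cut the runs into pieces and drop leftover digits, while \d+ always takes the whole run.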

Remember that quantifiers can be attached to any element. '2' or 'A' is an element, as it's a single character.
So we can do the following:

Regex: 2
Matches: 22 ham sandwiches 44 2

Regex: 2+
Matches: 22 ham sandwiches 44 2 2222

There is a character between '2' and '2222', the whitespace ' ', so the regex sequence stops in between.
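
A short JavaScript sketch of quantifiers on literal elements follows. The last example uses a made-up test string, "ABAB AB A", to show that a parenthesised group such as (AB) takes a quantifier just like a single character does.

```javascript
const order = "22 ham sandwiches 44 2 2222";

// '2' alone matches every single 2 separately
const singles = order.match(/2/g);
console.log(singles.length); // 7

// '2+' groups neighbouring 2s; the spaces end each run
const repeated = order.match(/2+/g);
console.log(repeated); // [ '22', '2', '2222' ]

// Parentheses turn several characters into one element
const grouped = "ABAB AB A".match(/(AB)+/g);
console.log(grouped); // [ 'ABAB', 'AB' ]
```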

{ }


If you want to only match a set number of the same element, use '{' and '}'.

Regex: B{4}
BBB BBBB BBA BBAAABBBAABB AABBBBBBBAAA


This matches four B in succession.

Regex: B{1,3}
BBB BBBB BBA BBAAABBBAABB AABBBBBBBAAA

This matches between 1 and 3 B in succession.
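
Both curly-brace forms can be checked against the test string above. This is a minimal JavaScript sketch; the output of the second expression is joined with spaces only to make it easier to read.

```javascript
const letters = "BBB BBBB BBA BBAAABBBAABB AABBBBBBBAAA";

// Exactly four B in a row; the run of seven contributes only one match
const exactlyFour = letters.match(/B{4}/g);
console.log(exactlyFour); // [ 'BBBB', 'BBBB' ]

// One to three B; the quantifier is greedy, so it takes three when it can
const oneToThree = letters.match(/B{1,3}/g);
console.log(oneToThree.join(" "));
// BBB BBB B BB BB BBB BB BBB BBB B
```

The run of four B is split into 'BBB' and 'B' by B{1,3}, because a match can never be longer than the upper limit.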

Closing Comments


The real power of Regex shows when you combine all these elements and quantifiers.

Task: find two word characters (letters, digits or underscores, which is what \w matches) followed by two to three digits
Regex: \w{2}\d{2,3}
oethu2390uN S<<o eeu9 34.R<>:Et<Eth;,go ogle.com23{H
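
The combined expression can be tried in JavaScript as sketched below. One thing worth knowing is that \w matches digits and underscores as well as letters. The stricter pattern using [A-Za-z] is an illustrative alternative that is not part of the original task, and "a1234" is a made-up input that shows the difference.

```javascript
const noisy = "oethu2390uN S<<o eeu9 34.R<>:Et<Eth;,go ogle.com23{H";

// Two word characters followed by two to three digits
const loose = noisy.match(/\w{2}\d{2,3}/g);
console.log(loose); // [ 'hu239', 'om23' ]

// \w also accepts digits, so 'a1' counts as the two word characters here
const looseOnDigits = "a1234".match(/\w{2}\d{2,3}/g);
console.log(looseOnDigits); // [ 'a1234' ]

// Illustrative stricter variant: letters only in the first part
const strict = /[A-Za-z]{2}\d{2,3}/g;
const strictOnDigits = "a1234".match(strict);
console.log(strictOnDigits); // null
const strictOnNoisy = noisy.match(strict);
console.log(strictOnNoisy); // [ 'hu239', 'om23' ]
```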

Sunday, December 2, 2012

[C#] Minimizing and optimizing .js in .NET (pre-4.5) - Google Closure API

Google Closure helps you minimize and optimize multiple .JS files into one file.

You can turn this code

function hello(name) {
  alert('Hello, ' + name);
}
hello('New user');

Into this code

function hello(a){alert("Hello, "+a)}hello("New user");

Note: if you're using ASP.NET 4.5, you already have this functionality out of the box: http://weblogs.asp.net/scottgu/archive/2011/11/27/new-bundling-and-minification-support-asp-net-4-5-series.aspx

But I made a simple library for interacting with the Google Closure API

Source paste
.DLL
Project and test

Usage:

using GoogleClosureHelper;

ClosureHelper.InputFolder = "JSFolder";
ClosureHelper.OutputFile = "Compiled.js";
ClosureHelper.CompileToOutputPath("SIMPLE_OPTIMIZATIONS", "text", "compiled_code");

Look here for the parameters: https://developers.google.com/closure/compiler/docs/api-ref

Web interface: http://closure-compiler.appspot.com/home

Friday, November 30, 2012

[C#] DocX .NET introduction - Headers, footers and some body




DocX is a fantastic tool for creating Microsoft Word-compatible documents without actually having Word installed. This is possible because a .docx file is a zip archive containing XML files. You can see this for yourself by opening a .docx file with WinRar. Being plain XML makes the library fully standalone, which for example makes the code deployable on a webserver (without paying high prices for a Word license).

And to be honest, the original Word API (Look for Microsoft Word Object Library in COM in Visual Studio, if you have Word installed) is terrible. An absolute disaster to code and read. DocX will save you. 
The strength of the DocX library is its extremely easy and straightforward coding, which lets anyone understand and use it after just a few minutes of introduction.


Requirements
Visual Studio 2010
.NET 4.0

Do not make the mistake of pressing the fat, purple and inviting "Download" button - at the date of writing this article it links to an old version  (v.1.0.0.12), while the current, community-updated version is v.1.0.1.13 and contains essential and new methods such as proper support for merging cells in a table. The author seems to have taken a year-long hiatus, which is sad to see.

I have taken the liberty to host both the compiled (and updated to v.1.0.1.13) DLL for DocX, and a stripped-down project for you to compile.

v.1.0.1.13

And here is the link to the original source

But now, let's see some code. We are going to make a document with headers, footers, and some body-text.

Example code to get you started:
Single code: http://pastebin.ca/2258114 http://www.mediafire.com/?np6hhsphge334b8
Solution: http://www.mediafire.com/?lu7dt9unvf8llfm

Reference DocX in your project

using Novacode;

Then

String fileName = "MyDocXDocument.docx";
DocX document = DocX.Create(fileName);

The "document" object is now our textfile which can be used to append pictures, tables and text.

Paragraphs

Adding text is simple:

document.InsertParagraph("some text");

Paragraphs are also objects:

Paragraph p1 = document.InsertParagraph();
p1.Append("I'm appended to a paragraph object.");     

The Paragraph object has many methods. For example, you can use p1.InsertPicture, p1.InsertPageBreakAfterSelf, p1.InsertTable, and many more things.

In essence, a paragraph is an "anchor" through which you can manipulate the page at the position where it resides. Every paragraph is a line in the document, as if you had pressed Enter in a text you're writing. Thus the code must insert paragraphs sequentially, in the order they flow through the document: top to bottom.

Document

The "document" object has of course properties linked to how the document looks overall, such as

document.MarginBottom = 1f;
document.PageWidth = 800f;
document.Save();

You can even use " ctrl+f " functionality with document.ReplaceText

Headers and Footers

Headers and footers function almost the same, the difference being that headers reside at the top of every page and footers at the bottom.

The document object has a property called DifferentFirstPage which lets you use a unique header on the first page of your document through the document.Headers.first object, so that it does not repeat on other pages. There are also document.Headers.even and document.Headers.odd, which manipulate pages with even and odd numbers respectively. However, there is no functionality to specify a header/footer on a specific page, probably due to limitations in the fundamental docx markup.

Self-explanatory code:

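document.AddHeaders(); // headers must be created before they can be accessed
document.DifferentFirstPage = true;
document.DifferentOddAndEvenPages = true; // needed for separate even and odd headers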

Header firstHeader = document.Headers.first;
Header evenHeader = document.Headers.even;
Header oddHeader = document.Headers.odd;
firstHeader.InsertParagraph("I'm a paragraph at the first header, at page 0.");
evenHeader.InsertParagraph("I'm a paragraph at even headers, page 2, 4, 6 etc.");
oddHeader.InsertParagraph("I'm a paragraph at odd headers, page 1, 3, 5 etc.");

Please check the author's site for more and advanced examples.

There is also a fairly active forum