ABy Admin
May 13'24

Exercise

Write a C program to generate a random integer array with a given length n , and sort it recursively using the Merge sort algorithm.

-sorting a one element array is easy.

- sorting two one-element arrays, requires the merge operation. The merge operation looks at two sorted arrays as lists, and compares the head of the list , and which ever head is smaller, this element is put on the sorted list and the head of that list is ticked off, so the next element becomes the head of that list. This is done until one of the lists is exhausted, and the other list is then copied onto the end of the sorted list.

- the recursion occurs, because merging two one-element arrays produces one two-element sorted array, which can be merged with another two-element sorted array produced the same way. This produces a sorted 4 element array, and the same applies for another 4 element sorted array.

- so the basic merge sort, is to check the size of list to be sorted, and if it is greater than one, divide the array into two, and call merge sort again on the two halves. After wards, merge the two halves in a temporary space of equal size, and then copy back the final sorted array onto the original array.

ABy Admin
May 13'24

One possible solution , after reading online descriptions of recursive merge sort, e.g. Dasgupta :

// to compile: gcc -Wall rmergesort.c -lm -o rmergesort

/*
 ============================================================================
 Name        : rmergesort.c
 Author      : Anon
 Version     : 0.1
 Copyright   : (C)2013 under CC-By-SA 3.0 License
 Description : Recursive Merge Sort, Ansi-style
 ============================================================================
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//const int MAX = 200;
const int MAX = 20000000;

int *b;

int printOff = 0;

// this debugging print out of the array helps to show
// what is going on.
void printArray(char* label, int* a, int sz) {
	int h = sz/ 2;
	int i;

	if (printOff) return;

	printf("\n%s:\n", label);

	for (i = 0; i < h; ++i ) {

		printf("%d%c", a[i],
					// add in a newline every 20 numbers
					( ( i != 0 && i % 20 == 0 )? '\n': ' ' ) );
	}

	printf(" | ");
	for (;i < sz; ++i) {
		printf("%d%c", a[i],
					( ( i != 0 && i % 20 == 0 )? '\n': ' ' ) );
	}

	putchar('\n');


}

void mergesort(int* a, int m ) {

	printArray("BEFORE", a, m);

	if (m > 2) {
		// if greater than 2 elements, then recursive
		mergesort(a, m / 2);
		mergesort(a + m / 2, m - m / 2);


	} else if (m == 2 && a[0] > a[1]) {
		// if exactly 2 elements and need swapping, swap
		int t = a[1];
		a[1] = a[0];
		a[0] = t;
		goto end;

	}

	// 1 or greater than 2 elements which have been recursively sorted

	// divide the array into a left and right array
	// merge into the array b with index l.

	int n = m/2;
	int o = m - n;

	int i = 0, j = n;
	int l = 0;
	// i is left, j is right ;
	// l should equal m the size of the array
	while (i < n) {
		if ( j >= m) {
			// the right array has finished, so copy the remaining left array
			for(; i < n; ++i) {
				b[l++] = a[i];
			}

		// the merge operation is to copy the smaller current element and
		// increment the index of the parent sub-array.
		} else if(  a[i] < a[j] ) {
			b[l++] = a[i++];
		} else {
			b[l++] = a[j++];
		}
	}

	while ( j < m) {
		// copy the remaining right array , if any
		b[l++] = a[j++];
	}

	memcpy(a, b, sizeof(int) * l );

end:
	printArray("AFTER", a, m);

	return;

}

void rand_init(int* a, int n) {
	int i;
	for (i = 0; i < n; ++i ) {

		a[i] = rand() % MAX;

	}
}

int main(void) {
	puts("!!!Hello World!!!"); /* prints !!!Hello World!!! */

//	int N = 20;
//    int N = 1000;
//    int N = 1000000;
	int N = 100000000;  // still can't make a stack overflow on ubuntu,4GB, phenom
	printOff = 1;

	int *a;

	b = calloc( N, sizeof(int));

	a = calloc( N, sizeof(int));

	rand_init(a, N);

	mergesort(a, N);

	printOff = 0;

	printArray("LAST", a, N);

	free(a);
	free(b);

	return EXIT_SUCCESS;
}


/* Having failed to translate my concept of non-recursive merge sort,
 * I tackled the easier case of recursive merge sort.
 * The next task is to translate the recursive version to a non-recursive
 * version. This could be done by replacing calls to mergesort, with
 * pushes onto a stack of
 * tuples of ( <array start address>, <number of elements to process> )
 */

/* The central idea of merging, is that two sorted lists can be
 * merged into one sorted list, by comparing the top of each list and
 * moving the lowest valued element onto the end of the new list.
 *  The other list which has the higher valued element keeps its top
 *  element unchanged. When a list is exhausted, copy the remaining other list
 *  onto the end of the new list.
 */

/* The recursive part, is to defer any work in sorting an unsorted list,
 * by dividing it into two lists until there is only 1 or two elements,
 * and if there are two elements, sort them directly by swapping if
 * the first element is larger than the second element.
 *
 * After returning from a recursive call, merge the lists, which will
 * begin with one or two element sorted lists. The result is a sorted list
 * which will be returned to the parent of the recursive call, and can
 * be used for merging.
 */

/* The following is an imaginary discussion about what a programmer
 * might be thinking about when programming:
 *
 * Visualising recursion in terms of a Z80 assembly language, which
 * is similar to most assembly languages, there is a data stack (DS) and
 * a call stack (CS) pointer, and each recursive call to mergesort
 * pushes the return address , which is the program address of the instruction
 * after the call , onto the stack pointed to by CS and CS is incremented,
 * and the address of the array start and integer which is the subarray length
 * onto the data stack pointed to by DS, which will be incremented twice.
 *
 * If the number of recursive , active calls exceed the allowable space for either the call stack
 * or the data stack, then the program will crash , or a process space protection
 * violation interrupt signal will be sent by the CPU, and the interrupt vector
 * for that signal will jump the processor's current instruction pointer to the
 * interrupt handling routine.
 */

References

Wikibooks contributors. "C Programming/Exercise solutions". Wikibooks. Wikibooks. Retrieved 13 May 2024.

00