The Stack
When a program starts, it is granted a fixed-size area of memory called the stack. Since all reasonable programming languages support recursive functions, arguments and local variables are allocated on the stack so that each call preserves its own values during execution.
The most famous function for presenting recursion is factorial(). Let's write yet another one. For our purposes, it prints the address of its argument to the standard output.
#include <stdio.h>
#include <stdlib.h>

double factorial(double n)
{
    printf("%.0f %u\n", n, &n);
    if (n == 0) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}

int main(int argc, char* argv[])
{
    double d = atof(argv[1]);
    printf("Factorial %f is %f\n", d, factorial(d));
    return 0;
}
argv[1] is the first argument we supply to our program, and atof() is the standard function that converts a string to a double-precision floating-point value.
When we run the program with the argument "5", it outputs:
$ ./a.out 5
5 123147368
4 123147336
3 123147304
2 123147272
1 123147240
0 123147208
Factorial 5 is 120
As you can see, the address of the argument n goes backwards by 32 bytes in each call. That 32-byte area is called a "stack frame". In addition to arguments and local variables, it stores the caller's return address in the code segment, so the program knows where to go when it is time to "return".
Let's do a silly thing and add a new local variable like the following:
double factorial(double n)
{
    char s[1000];
    ...
You can observe that the stack frame is bigger now:
$ ./a.out 5
5 244948200
4 244947160
3 244946120
2 244945080
1 244944040
0 244943000
Factorial 5 is 120
The stack segment is used in LIFO (last in, first out) fashion during execution. Let's add a new function called termial(). The name "termial" was coined by Donald Knuth, the famous scientist, programmer and author of many books. It is the analogue of factorial() that uses addition instead of multiplication.
...
double termial(double n)
{
    printf("%.0f %u\n", n, &n);
    if (n == 0) {
        return 0;
    } else {
        return n + termial(n - 1);
    }
}

int main(int argc, char* argv[])
{
    double d = atof(argv[1]);
    printf("&argc %u Factorial %.0f is %.0f\n", &argc, d, factorial(d));
    printf("&argc %u Termial %.0f is %.0f\n", &argc, d, termial(d));
    ...
As you can see, the stack addresses are reused across the separate calls, first for factorial(), then for termial():
$ ./a.out 5
5 214900440
4 214900408
3 214900376
2 214900344
1 214900312
0 214900280
&argc 214900476 Factorial 5 is 120
5 214900440
4 214900408
3 214900376
2 214900344
1 214900312
0 214900280
&argc 214900476 Termial 5 is 15
5 is a small number and works like a charm. But recursion silently eats the stack as it goes deeper. In the above program, we printed the address of argc to mark where our stack starts.
214900476 - 214900280 = 196
For the argument 5, 196 bytes of stack frames are used. What if we call our function with a bigger number like 1000?
$ ./a.out 1000
...
7 339722472
6 339722440
5 339722408
4 339722376
3 339722344
2 339722312
1 339722280
0 339722248
&argc 339754284 Termial 1000 is 500500
The difference between 339754284 and 339722248 is 32036, about 32 KB. It may seem little, but once we ask for the result of 1 million:
$ ./a.out 1000000
...
738237 2777516536
738236 2777516504
738235 2777516472
738234 2777516440
738233 2777516408
738232 2777516376
Segmentation fault (core dumped)
we reach the end of the stack and crash, because the addresses below the stack are unallocated. Let's do a little math again:
1000000 - 738232 = 261768
261768 * 32 = 8376576
As you can see, 8,376,576 bytes of stack were used in this scenario. Academics tend to over-teach recursion in their courses, and you get "stack-overflowed" at the most unfortunate time if it is overused. The end result is almost always "flattening" the algorithm, like the following:
double factorial(double n)
{
    int i;
    double result;

    if (n < 1) {
        return 1; // I don't care about negative numbers
    }
    result = 1;
    for (i = 1; i <= n; ++i) {
        result *= (double)i;
    }
    return result;
}
Ugly, isn't it? But it uses only a few bytes of stack and doesn't crash. It is also faster, because it avoids the function call overhead of arranging a stack frame and jumping to the start of a function. Those operations may be cheap, but they are not free.
The stack size is fixed. Why not a growable stack? Because in the real world, recursion may not be as evident as in factorial(), and sometimes a programmer's mistake causes infinite recursion. If the stack were growable, it could eat all the RAM storing useless intermediate values, grinding the computer to a halt. Thanks to the fixed-size stack, such faults crash the program early without killing the system.
Operating systems give programs a default stack size. It is also possible to control the stack size by other means. The first is telling the C compiler (more precisely, the linker) that our program needs more (or less) stack:
$ gcc -Wl,--stack,4194304 -o program program.c
This way, the program will request 4 MB of stack space when running. It is also possible to change the limit at runtime. On Linux, the setrlimit() system call is used for this purpose:
#include <sys/resource.h>

struct rlimit rl;
getrlimit(RLIMIT_STACK, &rl);     // fetch the current limits first
rl.rlim_cur = 16 * 1024 * 1024;   // soft limit: 16 MB of stack
int result = setrlimit(RLIMIT_STACK, &rl);
On Windows, the stack size is fixed at link time or at thread creation; the Win32 API's SetThreadStackGuarantee() function only controls how much stack is guaranteed to remain available when handling a stack overflow.
Modern languages aren't exempt from stack overflows. The following Java program:
public class Xyz {
    private static double factorial(double n) {
        if (n == 0) {
            return 1;
        } else {
            return n * factorial(n - 1);
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println("Factorial " + factorial(1000000));
        } catch (Error e) {
            System.out.println(e.getClass());
        }
    }
}
also crashes with the following:
class java.lang.StackOverflowError
Usually, in a catch block, it is best to log the stack trace. But for errors like stack overflow, the stack trace itself is too big to dump as-is, so we print only the error class to prove that it is a StackOverflowError. In practice, exceptions are dumped into log files, and as you can guess by now, bad recursion can blow up your log files too. Be careful.
Tail Call Elimination
We criticized the overuse of recursion for good reasons, but academics are not dumb. They invented a technique called "tail call elimination" to prevent unnecessary stack usage during recursion. Let's review the line in the factorial() example that returns by calling itself:
return n * factorial(n - 1);
This looks like the last thing the function does, so a smart implementation may decide that the current stack frame is no longer needed, and reuse it for the next call instead of allocating a new one. (Notice, though, that there is still a multiplication by n pending after the recursive call returns; that detail will matter shortly.)
Among the popular languages, Haskell, Scala and Lua are some of those that support tail call elimination. Let's write the termial function in Lua:
function termial(x)
    if x == 0 then
        return 0
    else
        return x + termial(x - 1)
    end
end

io.write("The result is ", termial(1000000, 1), "\n")
After running the program:
$ lua ornek.lua
lua: ornek.lua:5: stack overflow
stack traceback:
        ornek.lua:5: in function 'termial'
        ornek.lua:5: in function 'termial'
we still get a stack overflow. For tail call elimination to kick in, Lua requires that the return statement consist of a single function call, with no operation pending. So we rewrite termial():
function termial(x, answer)
    if x == 0 then
        return answer
    else
        return termial(x - 1, x + answer)
    end
end
As you can see, we reduced the return statement to a single function call by adding an extra accumulator parameter to termial() (and making it uglier). This time the program prints:
The result is 500000500001
Supporting tail call elimination is a perpetually heated topic among language designers. Python's creator Guido van Rossum made the most famous argument against it, calling it "unpythonic":
...when a tail recursion is eliminated, there's no stack frame left to use to print a traceback when something goes wrong later.
...the idea that TRE is merely an optimization, which each Python implementation can choose to implement or not, is wrong. Once tail recursion elimination exists, developers will start writing code that depends on it, and their code won't run on implementations that don't provide it.
...to me, seeing recursion as the basis of everything else is just a nice theoretical approach to fundamental mathematics (turtles all the way down), not a day-to-day tool.
http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html
Multithreading
In the beginning, CPUs got faster every year. But for the last decade or so, they haven't been getting faster as quickly as they once did. In a conflicting trend, ever more speed is demanded of the hardware because of the Internet.
The industry found the solution in parallelization. More CPU cores are added to mainboards, and nowadays even the cheapest smartphone has at least 2 cores.
Each program has at least 1 thread of execution, and since we have more CPUs at hand, we are encouraged to create more threads in our programs. The Go programming language was created with that in mind, providing first-class support for multithreading:
go f()
As you can see, it is as simple as using the "go" keyword to let the function run in another thread.
The bad news is that each thread of execution needs its own stack, so we should take that into account when spawning many threads, because most languages and runtimes don't tell us much about it. With "go f()", we basically ask the runtime to give f() its own stack and run it independently.
At the C or operating-system level, the stack size is taken into consideration when creating a thread. Let's see the prototype of Win32's CreateThread() function:
HANDLE CreateThread(
    LPSECURITY_ATTRIBUTES  lpThreadAttributes,
    SIZE_T                 dwStackSize,
    LPTHREAD_START_ROUTINE lpStartAddress,
    __drv_aliasesMem LPVOID lpParameter,
    DWORD                  dwCreationFlags,
    LPDWORD                lpThreadId
);
The second argument is the stack size for the thread. If you don't want to think about it, you can specify 0 and get the default stack size recorded in the executable, which is 1 MB most of the time. That is a rather massive number if you intend to create many threads, and the official Win32 documentation recommends the opposite:
It is best to choose as small a stack size as possible and commit the stack needed for the thread or fiber to run reliably. Every page that is reserved for the stack cannot be used for any other purpose.
https://docs.microsoft.com/en-us/windows/win32/procthread/thread-stack-size
A thread may have a small stack, and we will probably not know that while coding, especially in a big team. It is another reason to be safe and use the stack wisely.
To give modern programming languages their due, they have thought hard about stack usage behind the scenes. In Go, only 2 KB of stack space is reserved for each goroutine, and it grows dynamically as needed. Goroutines are managed carefully by the Go runtime itself to avoid the operating systems' possibly lousy handling of threads.
Generation 0
Modern languages are mostly object-oriented, and they try hard to make everything an object, including strings. So it is hard to abuse their stack with a declaration like this:
char s[1000];
Most local variables do not belong to basic types like int, char, or double; they are allocated via the new operator. But this time the heap is abused, because the burden of the stack is carried over there. Since the heap is dynamic, allocating and deallocating space are expensive operations. This was a big problem in the early stages of the evolution of modern language runtimes.
The solution was found in a technique called generational garbage collection. When an object is created, it is stored in a stack-like memory space called generation 0. If the object's life span is limited to the creating method, it is cheaply disposed of there, much like rewinding a stack frame. In reality, most object instances live and die this way.
To summarize: in practice, modern languages have a separate stack-like area called "Generation 0" in .NET, "Eden" in Java, the "youngest generation" in Python, and so on.
Segments of Memory
When we look at RAM naively, it is an array of bytes addressed linearly. In practice, it is a sparse array where different types of data converge in specific address ranges called segments. Each segment serves a different purpose for computer programs and is supported by assembly/machine language, the CPUs' mother tongue.
But before diving into this, let's recall one of our computers' architectural principles, which may look old but is still current.
Von Neumann Architecture
The earliest computers had fixed programs. They were like a calculator that knows how to do arithmetic on two numbers and nothing else. Changing what such a machine does requires extensive redesign and rewiring.
In 1944, John von Neumann became involved in the ENIAC project. There he worked on the "stored program" concept, in which programs are loaded into and stored in the computer's main memory like any other data. He wrote a paper titled "First Draft of a Report on the EDVAC". The paper circulated among computer builders, and the "Von Neumann Architecture" was born.
Since 1945, nearly all computers have been built upon this idea, including the one we created our examples on.
Types of memory in C
Let's start with a complete example that manifests all the kinds of memory in C:
#include <stdio.h>
#include <stdlib.h>

int g;
char *s = "Hello";

int main(int argc, char* argv[])
{
    int n;
    int *p = malloc(1);

    // Stack segment
    printf("&argc %u\n", &argc);
    printf("&n %u\n", &n);
    printf("&p %u\n", &p);

    // Code segment
    printf("main %u\n", main);

    // Data segment
    printf("&g %u\n", &g);
    printf("&s %u\n", &s);
    printf("s %u\n", s);

    // Heap
    printf("p %u\n", p);

    return 0;
}
The word "segment" attached to the code, data and stack names comes from the CPUs' native tongue, machine language. Machine language identifies those memory areas and provides convenient, fast instructions to reach them through offsets. Programming languages, including C, mimic that with their structuring of programs.
The output lists the addresses of different types of variables.
&argc 1048287484
&n 1048287500
&p 1048287504
main 2981486985
&g 2981498908
&s 2981498896
s 2981490692
p 3008201376
While the numbers may look similar, you can observe some convergence among them. Let's start with the most divergent ones, the variables that reside in the stack segment:
&argc 1048287484   +0
&n    1048287500   +16
&p    1048287504   +20
We will examine the stack extensively later; for now, we can simply say that all variables defined as function parameters and local variables reside in the stack segment.
When the program is loaded from an executable file (like .exe, .dll, .so) into the memory, its instructions and constants are placed in the code segment.
main 2981486985
s    2981490692   +3707
The global pointer variable s points to the constant "Hello" in the code segment. We know this because when we try to manipulate it via the following code:
s[1] = 'A';
it crashes with the good old "Segmentation fault", because the CPU prevents writing into memory marked as code.
Every program has global variables, and they are stored in the data segment:
&s 2981498896   +0
&g 2981498908   +12
The stack, code and data segments occupy fixed sizes defined by the program. But most programs need to allocate memory dynamically as they run. We call this memory the "heap". When we ran the above program, it requested 1 byte of memory via malloc(), and that byte is located quite far away from the other segments.
Modern Languages' Take on Segments
Let's rewrite the above program in Java, without the printf()s, since we can't take the address of a variable there:
public class Xyz {
    public static int g;
    public static String s = "Hello";

    private int i;

    public static void main(String[] args) {
        int n;
        Xyz p = new Xyz();
    }
}
In this program, the global variables are defined as static members of Xyz, since no variables are permitted outside class scope in Java. But they are real global variables, stored in the data segment and reachable as Xyz.g and Xyz.s anywhere in the program.
The variables args, n and p are located on the stack, as in C. The non-static attribute (or property) i is only reachable after an instance is allocated on the heap via the "new" operator. Here, p is a reference and occupies space on the stack, but the object instance it refers to is allocated on the heap.
Heap is not Cheap
Modern programming languages tend to overuse the heap, because the only available means of creating a data structure is the "new" operator.
In C, you define a structure like this:
struct PERSON {
    char name[100];
    char surname[100];
    int age;
};
This structure has a fixed size (204 bytes on 32-bit systems) and can live in the data segment, the stack segment or the heap. Let's define the same struct in Java:
class Person {
    String name;
    String surname;
    int age;
}
This structure must be placed on the heap. And since String is a class, whenever a value is assigned to one of these fields, the string is allocated separately, too.
Allocation on the heap is not as cheap as in the data and stack segments. The heap was invented because dynamic allocation is needed, and it comes with a price: for each request, an algorithm must search for an empty space. We'll explore heaps further in a separate article; for now, this information is adequate.
C# provides a "struct" construct just for this purpose. Its standard class library uses structs for trivial data structures:
struct Point {
    int X;
    int Y;
}
This is pragmatic and serves the purpose, until strings are encountered.
using System;
using System.Runtime.InteropServices;

namespace ConsoleApp2 {
    struct Person {
        internal string name;
        internal string surname;
        internal int age;
    }

    class Program {
        static void Main(string[] args) {
            Person p;
            p.name = "Michael";
            p.surname = "Jordan";
            p.age = 55;
            Console.WriteLine("Person size " + Marshal.SizeOf(p));
        }
    }
}
The output will be "Person size 24" (or 12 on 32-bit systems); 8 bytes per member, because the strings' contents are not included in the calculation. Here "Michael" and "Jordan" are constants loaded with the program, but in real programs string values come from somewhere else, and heap space must be provided for them.
It is possible to allocate fixed-size structs in C# with some magic:
...
[StructLayout(LayoutKind.Sequential)]
struct Person {
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 100)]
    internal string name;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 100)]
    internal string surname;
    internal int age;
}
...
This way we get 204, as in C.
Direct RAM Access
All programs operate on RAM regardless of programming language, operating system or hardware, be it a smartphone, the cloud, a desktop, a laptop or a machine as tiny as an Arduino.
Nowadays, "RAM" is used as a synonym for main computer memory. In reality, it is an abbreviation of "Random-Access Memory". Here "random" does not mean randomness; randomness is the last thing you expect from RAM. The hyphenated "random-access" means that programs can access any location in memory, in any order.
In this context, the opposite of random is sequential, meaning memory is accessed one location after another, much as we still see in SQL result sets. In old times, storage like tape was sequential: to access an arbitrary location, all previous locations on the band had to be visited. These distinctions may seem outdated today, as tapes are obsolete as storage devices and almost all memory has been "random" for the last 20 years.
Hard disks and CD/DVDs are also randomly accessed, but one thing separates RAM from them: fixed access time for every location. On a disk, a mechanical head must be moved into position, so access times vary by location. Nowadays, hard disks are rapidly being replaced by SSDs (Solid State Drives), which store as much data as disks and provide fixed access times like RAM.
From now on, let's assume RAM is a sea of bytes, each addressed by a plain number.
In C, you can directly access RAM via its address. Let's start with an example:
#include <stdio.h>

int main(int argc, char* argv[])
{
    char* p = 0;
    printf("%c\n", *p);
    return 0;
}
Here we start introducing the infamous C "pointers". Pointers are used like "references" in modern languages, but they really are raw memory addresses.
The type "char" (character) in the variable definition is the main mental block for understanding pointers while learning C. In reality, any variable defined with an asterisk (*) is essentially an integer wide enough to hold a memory address. The preceding type specifier is only used later, to interpret the data at that address.
The variable p is assigned the address 0; in other words, the location of the very first byte of memory.
char *p = 0;
Here, *p represents the data at address p, interpreted as a char. We'll try to write it to the standard output (stdout) with printf():
printf("%c\n", *p);
When you run this program on a PC, it crashes with the following message:
Segmentation fault (core dumped)
On Windows, it pops up a message box that says "Access Violation". This fault is the C equivalent of NullPointerException, NullReferenceException, etc. in modern languages.
The above program crashes because the ability to form an address doesn't mean the memory location it refers to really exists. Because modern operating systems like Linux and Windows run many programs (or processes, as they are called in that context) simultaneously, they must manage the precious RAM carefully through a mechanism called "virtual memory", often abbreviated VM.
Modern systems block access to address 0, because it almost always indicates a programming fault like the exceptions mentioned above. But C is also used for embedded programming: writing programs for very tiny CPUs that drive electronic controller hardware. Those machines have constrained resources and may not have the luxury of policing memory access. There, reading address 0, or any arbitrary location, may not crash at all.
Now let's try to assign an arbitrary address to p:
char *p = 5500;
The compiler will emit a warning for that number:
warning: initialization makes pointer from integer without a cast
This warning won't halt compilation. But as wise programmers, we should not accumulate warnings. Thanks to "casting", we can convince the compiler that we know what we are doing:
char *p = (char *)5500;
When you run the program, the result is again a segmentation fault. As you can see, C makes it easy to shoot yourself in the foot. And you are still lucky if you do, because at least you can go to a hospital: if that kind of error ends up reading or writing legal memory instead, your data integrity silently breaks, and god knows where the error pops up later.
Playing with Valid Memory Locations
Enough crashes. Let's use some valid memory addresses, starting with the following example:
#include <stdio.h>

int main(int argc, char* argv[])
{
    char c = 'A';
    printf("%u\n", &c);
    return 0;
}
Here we define c, a variable of 1 byte (one ASCII character). It occupies a location in RAM that stores a byte of data. The & operator takes the address of a variable, so the output is something like this:
1868946151
Let's play a little more and print every variable we've encountered in our little program:
printf("argc %u\n", &argc);
printf("argv %u\n", &argv);
printf("c %u\n", &c);
printf("main %u\n", main);
It outputs something like this:
argc 1527215996
argv 1527215984
c 1527216007
main 448100010
As you can see, even our main function's machine code is located somewhere in RAM.
Now let's manipulate memory via a pointer:
char c = 'A';
char* p = &c;

printf("c %c\n", c);
printf("*p %c\n", *p);

*p = 'B';
printf("c %c\n", c);
Here we define a pointer p and assign it the address of c, so p becomes a reference to c. The output is:
c A
*p A
c B
Now let's do something dangerous:
char* p = main;
printf("%c\n", *p);
*p = 'A';
This program manages to read and print the first byte of the main() function, but crashes when trying to write into it. Modern CPUs distinguish between code and data and prevent overwriting code.
U
Segmentation fault (core dumped)
If you compile the example above, you will probably get warnings, but it compiles anyway.
To get even more dangerous, we'll use a technique called "pointer arithmetic":
char c1 = 'A';
char c2 = 'B';
char *p = &c1;

printf("C1 %u %c\n", &c1, c1);
printf("C2 %u %c\n", &c2, c2);

p++;
*p = 'Z';
printf("C2 %u %c\n", &c2, c2);
When you run this program, the output will be:
C1 69358686 A
C2 69358687 B
C2 69358687 Z
As you can see, the value of c2 is changed magically by a series of manipulations:
char *p = &c1;
First we assign pointer p the address of c1.
p++;
*p = 'Z';
printf("C2 %u %c\n", &c2, c2);
Remember, a pointer is essentially an integer that holds a memory address. Since it is an integer, it is perfectly legal to increment it. By doing so, we changed the value of c2 without ever naming it.
Control is costly, and programs written in C are fast partly because direct manipulation of RAM avoids that cost.
Other Languages' Perspective on Accessing RAM
Most modern languages other than C and C++ prevent direct access to RAM. Instead, they mediate access through carefully defined handles called "references". First, there is no notion of taking the address of a variable of a basic type like char, int or double.
Second, object instances are stored in variables of reference type. A reference is opaque: you can't take its address, and you use it as-is. Of course, you can change which object a reference points to, but you can't make it point to an invalid arbitrary address like the 5500 we used above.
Of course, object instances do live somewhere in RAM, and in the 1990s references could leak that information when converted to a string. Today, garbage collectors (GC) may move objects around RAM for efficiency and heap defragmentation, so the printed identity must be something more stable than a mere memory address.
The following Java program creates two instances of class Object and converts them into a string:
public class Xyz {
    public static void main(String[] args) {
        Object o1 = new Object();
        Object o2 = new Object();
        System.out.println("O1 " + o1);
        System.out.println("O2 " + o2);
    }
}
The outputs include hash values that uniquely identify each GC-managed instance. As you can see, the two hexadecimal numbers are unrelated:
O1 java.lang.Object@3af49f1c
O2 java.lang.Object@19469ea2
One of the main design principles of modern languages is preventing pointer errors, as you can see in the preceding paragraphs. As we said before, direct RAM manipulation is what makes C very fast. However, most modern software doesn't need that much speed; correctness is valued more, since programmers are pushed to ship fast in this Internet era.
Zero, Null, Nil, Undefined
References can hold a special value called null (or nil, or undefined) when they do not point to any object instance. Let's make one fail by abusing it:
String s = null;
System.out.println("Length " + s.length());
The result is the Java-style "Segmentation fault":
Exception in thread "main" java.lang.NullPointerException
        at Xyz.main(Xyz.java:6)
Let me stress it again: in a modern programming language, a reference is either null or something real, never in-between as in C/C++.
Some newer languages like Kotlin (a JVM language) go even further and reject null assignments unless you specifically mark the reference type with a question mark:
val s: String = null  // Incorrect, does not compile
var s: String? = null // Correct

Leaky Abstraction

Operating systems like Linux and Windows provide C APIs for their services. A modern programming language runtime must call those APIs at some point to do meaningful things like I/O and creating windows. So each of those languages provides some means of accessing C libraries and interacting with C, and through them you can taste the pleasure of direct memory manipulation. For example, the C# programming language and its associated runtime, .NET, provide "Interop" facilities to interface with the operating system. The interop library contains a big class called Marshal, which has many dirty and dangerous static methods that go against all OOP principles. For example, the following methods are available to read and write a byte to/from RAM directly:
public static byte ReadByte(IntPtr ptr);
public static void WriteByte(IntPtr ptr, byte val);
IntPtr type represents a C pointer. These methods are ready to throw an "AccessViolationException" when you do the same experiments as in the above paragraphs. But when you access some valid C memory by some means outside the scope of this topic, you can use them conveniently. Read/Write methods have other variants which allow accessing different basic types like int and blocks of byte arrays at once. Now, as always, let's do some naughty things:
using System;
using System.Runtime.InteropServices;

namespace savas {
    class Program {
        static void Main(string[] args) {
            byte b = 33;
            GCHandle h = GCHandle.Alloc(b, GCHandleType.Pinned);
            IntPtr p = h.AddrOfPinnedObject();
            byte b2 = Marshal.ReadByte(p);
            Console.WriteLine("b2 = " + b2);
        }
    }
}
After defining the variable b, we "pin" it so the GC won't move it around in memory. Then we get its address via the AddrOfPinnedObject() method, just like the & operator in C, read the value there and print it. The output is "b2 = 33", as expected.
But if you call Marshal.WriteByte() to manipulate p, it doesn't write into b, because GCHandle.Alloc boxed the value type: p points to the pinned boxed copy, not to the local variable b. This lets C# stay pure, because the Marshal class' memory methods are designed to manipulate buffers provided by C libraries, not the other way around.
The Python programming language is written in C. It also provides a C interface that allows built-in classes and libraries to be written in C. If such a class supports the "buffer protocol", its raw bytes can be manipulated through Python's memoryview class. Python's standard bytes and bytearray objects support that protocol by default.
Without memoryview, the Python way of manipulating buffers is inefficient, since operations on arrays and slices create copies of the objects. memoryview allows C-style direct access to memory in a controlled way: the best of both worlds in certain scenarios.