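Overview
Some APIs for allocating and managing virtual memory
A quick look at the memory allocation interfaces in Unix systems. For now, this only:
- introduces two memory-management library calls
- describes in detail the functions for manually allocating and freeing heap memory (malloc/free)
The key question is:
how to allocate and manage memory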
1. Types of Memory
In running a C program, there are two types of memory that are allocated.
- stack/automatic memory
The first is called stack memory, and allocations and deallocations
of it are managed implicitly by the compiler for you, the programmer; for
this reason it is sometimes called automatic memory.
Ex:
void func() {
int x; // declares an integer on the stack
...
}
The compiler does the rest, making sure to make space on the stack
when you call into func(). When you return from the function, the compiler deallocates the memory for you; thus, if you want some information to live beyond the call invocation, you had better not leave that information on the stack.
In other words, the stack memory set aside for func() is released automatically when the call ends. Hence the need
to allocate and free memory yourself, which is what the heap provides.
- heap memory
The second is heap memory, where all allocations and deallocations are explicitly handled by you, the programmer.
Ex:
void func() {
int *x = (int *) malloc(sizeof(int));
...
}
Both stack and heap allocation occur on the line that declares x:
first, the compiler knows to make room for a pointer to an integer when it sees your declaration of said pointer (int *x);
subsequently, when the program calls malloc(), it requests space for an integer on the heap;
the routine returns the address of such an integer (upon success, or NULL on failure), which is then stored on the stack for use by the program.
Because of its explicit nature, and because of its more varied usage,
heap memory presents more challenges to both users and systems. Thus,
it is the focus of the remainder of our discussion.
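As a rough sketch of the lifetime difference described above: the helper below (make_int is a made-up name, not something from the text) stores a value on the heap so that it survives the return of the function that created it.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated int that outlives the call.
 * The pointer variable p lives on the stack of make_int(), but the integer
 * it points to stays on the heap until the caller frees it. */
int *make_int(int value) {
    int *p = malloc(sizeof(int));
    if (p == NULL) {
        return NULL;            /* allocation failed */
    }
    *p = value;
    return p;                   /* safe: the heap object survives the return */
}
```

The caller owns the result and is expected to pass it to free() once it is done with it. Returning the address of a local variable instead would hand back a pointer to stack memory that no longer exists.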
2. The malloc() Call
The malloc() call is quite simple: you pass it a size asking for some
room on the heap, and it either succeeds and gives you back a pointer to
the newly-allocated space, or fails and returns NULL.
About NULL:
NULL in C isn’t really anything special at all, just a macro for the value zero.
Ex:
man malloc
#include <stdlib.h>
...
void *malloc(size_t size);
From this information, you can see that all you need to do is include
the header file stdlib.h to use malloc. Strictly speaking, the header is not what
provides the function: the C library, which all C programs link with by default, has the code for malloc() inside of it. Adding the header lets the compiler check whether you are calling malloc() correctly (e.g., passing the right number of arguments to it, of the right type). Modern compilers warn about or reject a call to an undeclared function, so in practice always include it.
The single parameter malloc() takes is of type size_t, which simply
describes how many bytes you need. However, most programmers
do not type in a number here directly (such as 10); indeed, it would be
considered poor form to do so. Instead, various routines and macros are
utilized.
For example, to allocate space for a double-precision floating
point value, you simply do this:
double *d = (double *) malloc(sizeof(double));
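A minimal sketch of the same idea for an array, with the return value checked as described earlier. The helper name alloc_doubles and the overflow guard are additions for illustration, not part of the original notes.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for n doubles, sized with sizeof()
 * rather than a hard-coded byte count. Returns NULL on failure. */
double *alloc_doubles(size_t n) {
    if (n == 0 || n > SIZE_MAX / sizeof(double)) {
        return NULL;            /* nothing to allocate, or n * size overflows */
    }
    double *d = malloc(n * sizeof(double));
    if (d == NULL) {
        fprintf(stderr, "malloc of %zu doubles failed\n", n);
    }
    return d;                   /* contents are uninitialized */
}
```

A caller would write something like double *d = alloc_doubles(4); test the result against NULL, and call free(d) when finished.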
About sizeof()
This invocation of malloc() uses the sizeof() operator to request the right amount of space; in C, this is generally thought of as a compile-time operator, meaning that the actual size is known at compile time and thus a number (in this case, 8, for a double) is substituted as the argument to malloc(). For this reason, sizeof() is correctly thought of as an operator and not a function call
(a function call would take place at run time).
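sizeof() is an operator, not a function call.
Note that the following does not give the result you might expect:
int *x = malloc(10 * sizeof(int));
printf("%zu\n", sizeof(x));
The result is 4 (on 32-bit systems) or 8 (on 64-bit systems).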
In the first line, we’ve declared space for an array of 10 integers, which is fine and dandy. However, when we use sizeof() in the next line,
it returns a small value, such as 4 (on 32-bit machines) or 8 (on 64-bit
machines).
sizeof() only sees a pointer to int here (on a Mac, sizeof(int *) == 8); it does not measure the memory that x points to.
The reason is that in this case, sizeof() thinks we are simply
asking how big a pointer to an integer is, not how much memory we
have dynamically allocated.
The following, by contrast, gives the expected result:
int x[10];
printf("%zu\n", sizeof(x));
Here the result is 40.
In this case, there is enough static information for the compiler to
know that 40 bytes have been allocated.
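To sum up, here is a code snippet and its output when run on a Mac:
Code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    int x = 0; // x was undeclared in the original snippet; an int matches the reported size of 4
    printf("sizeof(x) = %zu\n", sizeof(x));
    int *y = (int *)malloc(10 * sizeof(int));
    printf("sizeof(y) = %zu\n", sizeof(y)); // size of the pointer, not of the 40-byte region
    int z[10];
    printf("sizeof(z) = %zu\n", sizeof(z)); // size of the whole array
    free(y);
    return 0;
}
Output:
sizeof(x) = 4
sizeof(y) = 8
sizeof(z) = 40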
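Since sizeof() cannot recover the size of a heap region from a pointer, one common workaround is to carry the element count alongside the pointer. The struct and function names below are invented for this sketch.

```c
#include <stdlib.h>

/* Hypothetical wrapper: keeps the element count next to the pointer,
 * because sizeof(buf.data) would only report the size of a pointer. */
struct int_buffer {
    int   *data;
    size_t count;
};

/* Returns a buffer of n ints; data is NULL and count is 0 on failure. */
struct int_buffer int_buffer_create(size_t n) {
    struct int_buffer buf = { NULL, 0 };
    buf.data = malloc(n * sizeof(int));
    if (buf.data != NULL) {
        buf.count = n;
    }
    return buf;
}

/* Size in bytes of the heap region, computed from the stored count. */
size_t int_buffer_bytes(const struct int_buffer *buf) {
    return buf->count * sizeof(int);
}
```

With int_buffer_create(10), int_buffer_bytes() reports 10 * sizeof(int), which is the number the earlier sizeof(y) was never going to give. The data pointer still has to be released with free().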
Another place to be careful is with strings.
For strings, do not use the sizeof() operator to compute the size passed to malloc(); use strlen(s) + 1 instead.
When declaring space for a string, use the following idiom:
malloc(strlen(s) + 1)
This gets the length of the string using the function strlen(), and adds 1 to it in order to make room for the end-of-string character. Using sizeof() may lead to trouble here.
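The idiom can be sketched as a small helper. The name copy_string is hypothetical; the calls it uses (strlen, malloc, memcpy) are standard C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL on failure.
 * strlen() does not count the terminating '\0', hence the + 1. */
char *copy_string(const char *s) {
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len + 1);   /* copies the characters and the '\0' */
    return copy;
}
```

Had the size been written as sizeof(s), the request would have been for the size of a pointer (4 or 8 bytes), regardless of how long the string is.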
About the return value of malloc():
void *malloc(size_t);
You might also notice that malloc() returns a pointer to type void.
Doing so is just the way in C to pass back an address and let the programmer decide what to do with it. The programmer further helps out
by using what is called a cast.
In int *x = (int *)malloc(sizeof(int));
the (int *) written before the call to malloc() is the cast. It has no special effect beyond informing the compiler and other programmers.
Casting doesn’t really accomplish anything, other than tell the compiler and other programmers who might be reading your code: “yeah, I know what I’m doing.” By casting the result of malloc(), the programmer is just giving some reassurance; the cast is not needed for the correctness.
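To illustrate that the cast is optional in C, here is a sketch that omits it. The struct point type and point_create name are made up for the example, and the sizeof *p spelling is one common style rather than anything the notes prescribe. Note that this applies to C only; a C++ compiler would require the cast.

```c
#include <stdlib.h>

struct point {
    int x;
    int y;
};

/* Hypothetical helper: void * converts to struct point * implicitly in C,
 * so no cast is written. sizeof *p follows the type of p automatically. */
struct point *point_create(int x, int y) {
    struct point *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    p->x = x;
    p->y = y;
    return p;
}
```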
3. The free() Call
man free
void free(void *ptr);
When allocated memory is no longer needed, remember to free it.
As it turns out, allocating memory is the easy part of the equation;
knowing when, how, and even if to free memory is the hard part. To free
heap memory that is no longer in use, programmers usually just call free():
int *x = malloc(10 * sizeof(int));
...
free(x);
Here free() only needs the pointer that the allocation returned.
The routine takes one argument, a pointer that was returned by malloc(). Thus, you might notice, the size of the allocated region is not passed in by the user, and must be tracked by the memory-allocation library itself.
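A small sketch of the full allocate, use, free cycle. The function sum_of_squares is an invented example and assumes n is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical example: sums 0*0 + 1*1 + ... + (n-1)*(n-1) using a
 * temporary heap array. Returns -1 if the allocation fails. */
long sum_of_squares(size_t n) {
    long *squares = malloc(n * sizeof(long));
    if (squares == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        squares[i] = (long)(i * i);
    }
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += squares[i];
    }
    free(squares);  /* only the pointer is passed; the library tracks the size */
    return total;
}
```

The temporary array never leaves the function, so the function that allocated it is also the natural place to free it.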
Some common memory-management errors are recorded below.
4. Common Errors
Many things can go wrong when allocating and managing memory.
There are a number of common errors that arise in the use of malloc()
and free().
This is why "higher-level" programming languages usually provide automatic memory management and garbage collection.
Correct memory management has been such a problem, in fact, that
many newer languages have support for automatic memory management. In such languages, while you call something akin to malloc() to allocate memory (usually new or something similar to allocate a new object), you never have to call something to free space; rather, a garbage collector runs and figures out what memory you no longer have references to and frees it for you.
The common errors are listed below:
Forgetting To Allocate Memory
segmentation fault
char *src = "hello";
char *dst; // oops! unallocated
strcpy(dst, src); // segfault and die
The correct way looks like this:
char *src = "hello";
char *dst = (char *) malloc(strlen(src) + 1);
strcpy(dst, src); // work properly
A slightly more convenient function is strdup();
see man strdup for details.
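A minimal sketch of the strdup() route, assuming a POSIX system (strdup() is declared in string.h there; the feature-test macro on the first line is only needed under strict compiler modes). The wrapper name duplicate_greeting is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* asks for the POSIX declaration of strdup() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: strdup() allocates strlen(src) + 1 bytes with malloc()
 * and copies the string, so the result must later be released with free(). */
char *duplicate_greeting(void) {
    const char *src = "hello";
    return strdup(src);         /* NULL if the allocation fails */
}
```

This removes the chance of getting the size arithmetic wrong, but the caller is still responsible for freeing the copy.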
Not Allocating Enough Memory
buffer overflow
char *src = "hello";
char *dst = (char *) malloc(strlen(src)); // too small!
strcpy(dst, src); // may appear to work, but writes one byte past the end of dst
Another valuable lesson: just because it ran correctly once doesn’t mean it’s correct.
Forgetting to Initialize Allocated Memory
uninitialized read
With this error, you call malloc() properly, but forget to fill in some values
into your newly-allocated data type.
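One way to avoid an uninitialized read is to give every element a value at allocation time. The helper below is a sketch under that assumption; alloc_filled is a made-up name.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n ints and gives every element a defined
 * value before anyone reads it. Returns NULL on failure. */
int *alloc_filled(size_t n, int value) {
    int *a = malloc(n * sizeof(int));
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = value;           /* without this loop, a[i] is indeterminate */
    }
    return a;
}
```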
Forgetting To Free Memory
memory leak
In a long-running program, forgetting to free memory can have disastrous consequences.
In long-running applications or systems (such as the OS itself), this is a huge problem, as slowly leaking memory eventually leads one to run out of memory, at which point a restart is required.
Note that using a garbage-collected language doesn’t help here: if you still have a reference to some chunk of memory, no garbage collector will ever free it, and thus memory leaks remain a problem even in more modern languages.
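In a small program it may seem fine not to call free(), because the program ends soon and its memory is reclaimed anyway. Even so, this is a bad habit; remember to free memory explicitly.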
In some cases, it may seem like not calling free() is reasonable. For
example, your program is short-lived, and will soon exit; in this case,
when the process dies, the OS will clean up all of its allocated pages and thus no memory leak will take place per se.
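For the long-running case, the following sketch shows the pattern that keeps memory use flat: each iteration frees what it allocated. The function process_requests and its parameters are invented for illustration, and the memset() call merely stands in for real work.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: handles `rounds` requests, each needing a scratch
 * buffer of `size` bytes. Returns the number of rounds completed.
 * Dropping the free() call would leak `size` bytes per round. */
size_t process_requests(size_t rounds, size_t size) {
    size_t done = 0;
    for (size_t i = 0; i < rounds; i++) {
        char *scratch = malloc(size);
        if (scratch == NULL) {
            break;
        }
        memset(scratch, 0, size);   /* stand-in for real work */
        free(scratch);              /* release before the pointer is lost */
        done++;
    }
    return done;
}
```

In a server loop that never exits, the version without free() would grow without bound, which is the slow leak described above.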
Freeing Memory Before You Are Done With It
dangling pointer
The subsequent use can crash the program, or overwrite valid memory (e.g., you called free(), but then called malloc() again to allocate something else, which then recycles the errantly-freed memory).
Freeing Memory Repeatedly
double free
The result of doing so is undefined. As you can imagine, the memory-allocation library might get confused and do all sorts of weird things; crashes are a common outcome.
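A common defensive habit against both dangling pointers and double frees is to clear the pointer right after freeing it. The sketch below assumes that habit; free_and_clear is a hypothetical name, and it only protects the one variable passed in, not other copies of the same pointer.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and then sets it to NULL. A later
 * free_and_clear() on the same variable becomes free(NULL), which the C
 * standard defines as a no-op. */
void free_and_clear(int **pp) {
    free(*pp);
    *pp = NULL;
}
```

After free_and_clear(&x), a second call on x is harmless, and code that tests x against NULL before use can tell that the memory is gone.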
Calling free() Incorrectly
invalid frees
One last problem we discuss is the call of free() incorrectly. After all, free() expects you only to pass to it one of the pointers you received from malloc() earlier. When you pass in some other value, bad things can (and do) happen.
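One way an invalid free happens in practice is by advancing the pointer that malloc() returned and then freeing the advanced value. The sketch below (fill_and_sum is an invented example, assuming n is greater than zero) keeps the original pointer separate from the moving one.

```c
#include <stdlib.h>

/* Hypothetical example: fills an array through a moving cursor and returns
 * the sum. The pointer handed to free() is the one malloc() returned (base),
 * not the cursor, which by then points past the end of the data. */
int fill_and_sum(size_t n) {
    int *base = malloc(n * sizeof(int));
    if (base == NULL) {
        return -1;
    }
    int *cursor = base;
    for (size_t i = 0; i < n; i++) {
        *cursor++ = (int)i;
    }
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += base[i];
    }
    /* free(cursor) would be an invalid free: cursor == base + n here. */
    free(base);
    return sum;
}
```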
WHY NO MEMORY IS LEAKED ONCE YOUR PROCESS EXITS
The reason is simple: there are really two levels of memory management in the system:
- the operating-system level
- the process level
The first level of memory management is performed by the OS, which
hands out memory to processes when they run, and takes it back when
processes exit (or otherwise die). The second level of management
is within each process, for example within the heap when you call
malloc() and free(). Even if you fail to call free() (and thus leak
memory in the heap), the operating system will reclaim all the memory of
the process (including those pages for code, stack, and, as relevant here, heap) when the program is finished running. No matter what the state of your heap in your address space, the OS takes back all of those pages when the process dies, thus ensuring that no memory is lost despite the fact that you didn’t free it.
So at the process level, memory that was allocated but never freed stops being a problem when the process ends, because the operating system clears everything belonging to that process out of memory.
Thus, for short-lived programs, leaking memory often does not cause any
operational problems (though it may be considered poor form). When you write a long-running server (such as a web server or database management system, which never exit), leaked memory is a much bigger issue, and will eventually lead to a crash when the application runs out of memory.
A memory leak at the OS level, however, is disastrous.
And of course, leaking memory is an even larger issue inside one particular program: the operating system itself. Showing us once again: those who write the kernel code have the toughest job of all…
5. Underlying OS Support
malloc() and free() are library calls, not system calls.
You might have noticed that we haven’t been talking about system calls when discussing malloc() and free(). The reason for this is simple: they are not system calls, but rather library calls.
So the malloc library works within the virtual address space, but it is itself built on top of system calls that enter the OS to request or release memory.
Thus the malloc library manages space within your virtual address space, but itself is built on top of some system calls which call into the OS to ask for more memory or release some back to the system.
The relevant system calls include brk and sbrk.
One such system call is called brk, which is used to change the location of the program’s break: the location of the end of the heap. It takes one argument (the address of the new break), and thus either increases or decreases the size of the heap based on whether the new break is larger or smaller than the current break. An additional call sbrk is passed an increment but otherwise serves a similar purpose.
However, calling these two system calls directly is not recommended, because it easily goes wrong.
Note that you should never directly call either brk or sbrk. They
are used by the memory-allocation library; if you try to use them, you
will likely make something go (horribly) wrong. Stick to malloc() and
free() instead.
Finally, you can also obtain memory from the operating system via the
mmap() call. By passing in the correct arguments, mmap() can create an
anonymous memory region within your program: a region which is not associated with any particular file but rather with swap space, something we’ll discuss in detail later on in virtual memory. This memory can then also be treated like a heap and managed as such. Read the manual page of mmap() for more details.
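A rough sketch of such an anonymous mapping, assuming a Linux, macOS, or BSD system. The helper names anon_region_create and anon_region_destroy are hypothetical; the flag is spelled MAP_ANONYMOUS on Linux and MAP_ANON on some other systems, which the fallback define tries to cover. Check the local mmap() manual page before relying on it.

```c
#define _DEFAULT_SOURCE 1        /* glibc: expose MAP_ANONYMOUS in strict modes */
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* alternative spelling on BSD-derived systems */
#endif

/* Hypothetical helper: asks the OS for `len` bytes of readable, writable
 * memory that is not backed by any file. Returns NULL on failure. */
void *anon_region_create(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Returns the region to the OS; len must match the original request. */
int anon_region_destroy(void *p, size_t len) {
    return munmap(p, len);
}
```

Note that mmap() signals failure with MAP_FAILED rather than NULL, and that the region is released with munmap(), not free().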
6. Other Calls
There are also some other memory-allocation library calls, such as calloc() and realloc().
There are a few other calls that the memory-allocation library supports. For example, calloc() allocates memory and also zeroes it before returning; this prevents some errors where you assume that memory
is zeroed and forget to initialize it yourself (see the paragraph on “uninitialized reads” above). The routine realloc() can also be useful, when you’ve allocated space for something (say, an array), and then need to add something to it: realloc() makes a new larger region of memory,
copies the old region into it, and returns the pointer to the new region.
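Both calls can be sketched together. The helper names zeroed_array and grow_array are invented for this example, and grow_array assumes new_n is larger than old_n. Storing the result of realloc() in a temporary is a precaution: if realloc() fails it returns NULL and leaves the old block untouched, so overwriting the only pointer to it would leak it.

```c
#include <stdlib.h>

/* calloc(n, size) returns room for n elements, all bytes set to zero. */
int *zeroed_array(size_t n) {
    return calloc(n, sizeof(int));
}

/* Hypothetical helper: grows *arr from old_n to new_n ints and zeroes the
 * new tail. Returns 0 on success; on failure returns -1 and leaves *arr
 * pointing at the original, still valid block. */
int grow_array(int **arr, size_t old_n, size_t new_n) {
    int *tmp = realloc(*arr, new_n * sizeof(int));
    if (tmp == NULL) {
        return -1;
    }
    for (size_t i = old_n; i < new_n; i++) {
        tmp[i] = 0;             /* realloc() does not zero the added space */
    }
    *arr = tmp;                 /* the block may have moved */
    return 0;
}
```

After a successful grow_array() call, the old contents are preserved and any earlier copy of the pointer must be treated as invalid.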
7. Summary
Two recommended bedside books:
“The C Programming Language”
Brian Kernighan and Dennis Ritchie
Prentice-Hall 1988
“Advanced Programming in the UNIX Environment”
W. Richard Stevens and Stephen A. Rago
Addison-Wesley, 2005
Later posts will record some programming exercises.