C Programming Language

C is the de-facto programming language for serious systems programming. Why? Most kernels expose their API through C. The Linux kernel (Love #ref-Love) and the XNU kernel (Inc. #ref-xnukernel), on which macOS is based, are written in C and have a C API (Application Programming Interface). The Windows kernel uses C++, but systems programming on Windows is much harder than on UNIX for novice systems programmers. C doesn’t have abstractions like classes and Resource Acquisition Is Initialization (RAII) to clean up memory. C also gives you much more of an opportunity to shoot yourself in the foot, but it lets you do things at a much more fine-grained level.

History of C

C was developed by Dennis Ritchie and Ken Thompson at Bell Labs back in 1973 (Ritchie #ref-Ritchie:1993:DCL:155360.155580). Back then, we had gems of programming languages like Fortran, ALGOL, and LISP. The goal of C was two-fold. Firstly, it was made to target the most popular computers at the time, such as the PDP-7. Secondly, it tried to remove some of the lower-level constructs (managing registers, and programming assembly for jumps), and create a language that had the power to express programs procedurally (as opposed to mathematically like LISP) with readable code. All this while still having the ability to interface with the operating system. It sounded like a tough feat. At first, it was only used internally at Bell Labs along with the UNIX operating system.

The first “real” standardization was with Brian Kernighan and Dennis Ritchie’s book (Kernighan and Ritchie #ref-kernighan1988c). It is still widely regarded today as the only portable set of C instructions. The K&R book is known as the de-facto standard for learning C. There were different standards of C from ANSI to ISO, though ISO largely won out as a language specification. We will mainly focus on the POSIX C library, which extends ISO C. Now, to address the elephant in the room: the Linux kernel is not fully POSIX compliant. Mostly, this is because the Linux developers didn’t want to pay the fee for compliance. It is also because they did not want to be fully compliant with a multitude of different standards, since that would mean increased development costs to maintain compliance.

We will aim to use C99, as it is the standard that most computers recognize, but sometimes use some of the newer C11 features. We will also talk about some off-hand features like getline because they are so widely used with the GNU C library. We’ll begin by providing a fairly comprehensive overview of the language with language facilities. Feel free to gloss over if you have already worked with a C based language.

Features

  • Speed. There is little separating a program and the system.

  • Simplicity. C and its standard library comprise a simple set of portable functions.

  • Manual Memory Management. C gives a program the ability to manage its memory. However, this can be a downside if a program has memory errors.

  • Ubiquity. Through foreign function interfaces (FFI) and language bindings of various types, most other languages can call C functions and vice versa. The standard library is also everywhere. C has stood the test of time as a popular language, and it doesn’t look like it is going anywhere.

Crash course introduction to C

The canonical way to start learning C is by starting with the hello world program. The original example that Kernighan and Ritchie proposed way back when hasn’t changed.

#include <stdio.h>
int main(void) {
  printf("Hello World\n");
  return 0;
}
  1. The #include directive takes the file stdio.h (which stands for standard input and output) located somewhere in your operating system, copies the text, and substitutes it where the #include was.

  2. The int main(void) is a function declaration. The first word int tells the compiler the return type of the function. The part before the parenthesis (main) is the function name. In C, no two functions can have the same name in a single compiled program, although shared libraries may be able to. Then, the parameter list comes after. When we provide the parameter list (void) for a regular function, the compiler should produce an error if the function is called with a non-zero number of arguments. For regular functions, a declaration like void func() means that the function can be called like func(1, 2, 3), because an empty parameter list leaves the parameters unspecified. main is a special function. There are many ways of declaring main, but the standard ones are int main(void), int main(), and int main(int argc, char *argv[]).

  3. printf("Hello World\n"); is a function call. printf is declared in stdio.h. The function has been compiled and lives somewhere else on our machine: the location of the C standard library. Just remember to include the header and call the function with the appropriate parameters (a string literal "Hello World\n"). If the newline isn’t included, the buffer may not be flushed right away when writing to a terminal (i.e. the write will not complete immediately).

  4. return 0. main has to return an integer. By convention, return 0 means success and anything else means failure. Here are some exit codes / statuses with special meaning: http://tldp.org/LDP/abs/html/exitcodes.html. In general, assume 0 means success.

$ gcc main.c -o main
$ ./main
Hello World
$
  1. gcc is short for the GNU Compiler Collection, which has a host of compilers ready for use. The compiler infers from the extension that you are trying to compile a .c file.

  2. ./main tells your shell to execute the program in the current directory called main. The program then prints out “Hello World”.

If systems programming were as easy as writing hello world though, our jobs would be much easier.

Preprocessor

What is the preprocessor? Preprocessing is a copy and paste operation that the compiler performs before actually compiling the program. The following is an example of substitution.

// Before preprocessing
#define MAX_LENGTH 10
char buffer[MAX_LENGTH];

// After preprocessing
char buffer[10];

There are side effects to the preprocessor though. One problem is that the preprocessor needs to be able to tokenize properly, meaning that trying to redefine the internals of the C language with a preprocessor may be impossible. Another problem is that macros can’t be nested infinitely: there is a bounded depth where they need to stop. Macros are also simple text substitutions, without semantics. For example, look at what can happen if a macro tries to perform an inline modification.

#define min(a,b) a < b ? a : b
int main() {
  int x = 4;
  if(min(x++, 5)) printf("%d is six", x);
  return 0;
}

Macros are simple text substitution, so the condition in the above example expands to

x++ < 5 ? x++ : 5

In this case, it is opaque what gets printed out, but it will be 6. Can you try to figure out why? Also, consider the edge case when operator precedence comes into play.

int x = 99;
int r = 10 + min(99, 100); // r is 100, not the expected 109!
// This is what it is expanded to
// int r = 10 + 99 < 100 ? 99 : 100;
// Which means
// int r = (10 + 99) < 100 ? 99 : 100;

There are also logical problems with the flexibility of certain parameters. One common source of confusion is with static arrays and the sizeof operator.

#define ARRAY_LENGTH(A) (sizeof((A)) / sizeof((A)[0]))
int static_array[10]; // ARRAY_LENGTH(static_array) = 10
int* dynamic_array = malloc(10 * sizeof(int)); // ARRAY_LENGTH(dynamic_array) = 2 or 1, whatever the allocation size

What is wrong with the macro? Well, it works if a static array is passed in, because sizeof a static array returns the number of bytes that the array takes up, and dividing it by sizeof(an_element) gives the number of entries. But if it is passed a pointer to a piece of memory, taking the sizeof the pointer and dividing it by the size of the first entry won’t give us the size of the array, except by coincidence.

Language Facilities

Keywords

C has an assortment of keywords. Here are some constructs that you should know briefly as of C99.

  1. break is a keyword that is used in case statements or looping statements. When used in a case statement, the program jumps to the end of the block.
switch(1) {
  case 1: /* Jumps to this case */
    puts("1");
    break; /* Jumps to the end of the block */
  case 2: /* This case is skipped */
    puts("2");
    break;
} /* Continues here */

In the context of a loop, using it breaks out of the inner-most loop. The loop can be either a for, while, or do-while construct.

while(1) {
  while(2) {
    break; /* Breaks out of while(2) */
  } /* Jumps here */
  break; /* Breaks out of while(1) */
} /* Continues here */
  1. const is a language-level construct that tells the compiler that this data should remain constant. If one tries to change a const variable, the program will fail to compile. const works a little differently when put before the type: the compiler re-orders the first type and const. Then the compiler uses a left associativity rule, meaning that const applies to whatever is immediately to its left. Using const consistently in this way is known as const-correctness.
const int i = 0; // Same as "int const i = 0"
char *str = ...; // Mutable pointer to a mutable string
const char *const_str = ...; // Mutable pointer to a constant string
char const *const_str2 = ...; // Same as above
const char *const const_ptr_str = ...;
// Constant pointer to a constant string

But, it is important to know that this is a compiler imposed restriction only. There are ways of getting around this, and the program may appear to run fine, but modifying an object that was defined const is undefined behavior according to the C standard. In systems programming, the only type of memory that you can’t write to is system write-protected memory.

const int i = 0; // Same as "int const i = 0"
(*((int *)&i)) = 1; // Compiles, but is undefined behavior; i may or may not read as 1 now
const char *ptr = "hi";
*((char *)ptr) = '\0'; // Will likely cause a Segmentation Violation
  1. continue is a control flow statement that exists only in loop constructions. continue skips the rest of the loop body and jumps back to the loop’s condition check (in a for loop, the update statement runs first).
int i = 10;
while(i--) {
  if(1) continue; /* This gets triggered */
  *((int *)NULL) = 0;
} /* Then reaches the end of the while loop */
  1. do {} while(); is another loop construct. These loops execute the body and then check the condition at the bottom of the loop. If the condition is zero, the next statement is executed: the program counter is set to the first instruction after the loop. Otherwise, the loop body is executed again.
int i = 1;
do {
  printf("%d\n", i--);
} while (i > 10); /* Only executed once */
  1. enum is used to declare an enumeration. An enumeration is a type that can take on a finite set of values. If you have an enum and don’t specify any numerics, the C compiler will generate a unique number for each value (within the context of the current enum) and use that for comparisons. The syntax to declare an instance of an enum is enum <type> varname. The added benefit to this is that the compiler can check these expressions to help make sure that you are only comparing alike types.
enum day{ monday, tuesday, wednesday,
  thursday, friday, saturday, sunday};

void process_day(enum day foo) {
  switch(foo) {
  case monday:
    printf("Go home!\n"); break;
  // ...
  }
}

It is completely possible to assign enum values to be either different or the same. If you assign numbers yourself, it is not advisable to rely on the compiler for consistent numbering of the remaining values. If you are going to use this abstraction, try not to break it.

enum day{
  monday = 0,
  tuesday = 0,
  wednesday = 0,
  thursday = 1,
  friday = 10,
  saturday = 10,
  sunday = 0};

void process_day(enum day foo) {
  switch(foo) {
  case monday:
    printf("Go home!\n"); break;
  // ...
  }
}
  1. extern is a special keyword that tells the compiler that the variable may be defined in another object file or a library, so the program compiles even though the variable is not defined in the current file; the reference is resolved at link time against the system or another file.
// file1.c
extern int panic;
void foo() {
  if (panic) {
    printf("NONONONONO");
  } else {
    printf("This is fine");
  }
}

//file2.c
int panic = 1;
  1. for is a keyword that allows you to iterate with an initialization condition, a loop invariant, and an update condition. This is meant to be equivalent to a while loop, but with differing syntax.
for (initialization; check; update) {
//...
}
// Typically
int i;
for (i = 0; i < 10; i++) {
//...
}

In the C89 standard, one cannot declare variables inside the for loop initialization block. This is because there was a disagreement in the standard over how the scoping rules of a variable defined in the loop would work. It has since been resolved with more recent standards, so people can use the for loop that they know and love today.

for(int i = 0; i < 10; ++i) { /* ... */ }

The order of evaluation for a for loop is as follows:

  • (a) Perform the initialization statement.
  • (b) Check the invariant. If false, terminate the loop and execute the next statement. If true, continue to the body of the loop.
  • (c) Perform the body of the loop.
  • (d) Perform the update statement.
  • (e) Jump to checking the invariant step.
  1. goto is a keyword that allows you to do unconditional jumps to a label. Do not use goto in your programs. The reason is that it makes your code much harder to understand when multiple jumps are chained together, which is called spaghetti code. It is acceptable to use in some contexts though, for example, error checking code in the Linux kernel. The keyword is usually used in kernel contexts when adding another stack frame for cleanup isn’t a good idea. The canonical example of kernel cleanup is as below.
void setup(void) {
  Doe *deer;
  Ray *drop;
  Mi *myself;

  if (!setupdoe(deer)) {
    goto finish;
  }

  if (!setupray(drop)) {
    goto cleanupdoe;
  }

  if (!setupmi(myself)) {
    goto cleanupray;
  }

  perform_action(deer, drop, myself);
cleanupray:
  cleanup(drop);
cleanupdoe:
  cleanup(deer);
finish:
  return;
}
  1. if else else-if are control flow keywords. There are a few ways to use these: (1) a bare if, (2) an if with an else, (3) an if with an else-if, (4) an if with an else-if and an else. Note that an else is matched with the most recent if. A subtle bug related to a mismatched if and else statement is the dangling else problem. The conditions are always evaluated in order from the if to the else. If any of the intermediate conditions is true, the program performs that block’s action and goes to the end of the whole if statement.
// (1)
if (connect(...))
  return -1;
// (2)
if (connect(...)) {
  exit(-1);
} else {
  printf("Connected!");
}
// (3)
if (connect(...)) {
  exit(-1);
} else if (bind(..)) {
  exit(-2);
}
// (4)
if (connect(...)) {
  exit(-1);
} else if (bind(..)) {
  exit(-2);
} else {
  printf("Successfully bound!");
}
  1. inline is a compiler keyword that tells the compiler it’s okay to omit the C function call procedure and “paste” the code into the caller. That is, the compiler is hinted to substitute the function body directly into the calling function. Writing it explicitly is not always recommended, as the compiler is usually smart enough to know when to inline a function for you.
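// static keeps the definition local to this file, which avoids link errors
// under C99 rules when the compiler decides not to inline the call
static inline int max(int a, int b) {
  return a > b ? a : b;
}

int main() {
  int a = 1, b = 2;
  printf("Max %d", max(a, b));
  // printf("Max %d", a > b ? a : b);
}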
  1. restrict is a keyword that tells the compiler that this particular memory region shouldn’t overlap with any other memory region. The use case for this is to tell users of the program that it is undefined behavior if the memory regions overlap. Note that memcpy has undefined behavior when memory regions overlap. If this might be the case in your program, consider using memmove.
void *memcpy(void * restrict dest, const void* restrict src, size_t bytes);

void add_array(int *a, int * restrict c) {
  *a += *c;
}
int *a = malloc(3*sizeof(*a));
a[0] = 1; a[1] = 2; a[2] = 3;
add_array(a + 1, a); // Well defined
add_array(a, a); // Undefined
  1. return is a control flow operator that exits the current function. If the function is void, then it simply exits the function. Otherwise, an expression follows as the return value.
int process() {
  if (connect(...)) {
    return -1;
  } else if (bind(...)) {
    return -2;
  }
  return 0;
}
  1. signed is a modifier which is rarely used, but it forces a type to be signed instead of unsigned. The reason that this is so rarely used is that integer types are signed by default (plain char is the exception: its signedness is implementation-defined) and need the unsigned modifier to make them unsigned. It may still be useful in cases where you want the compiler to default to a signed type (signed on its own means signed int), such as below.
int count_bits_and_sign(signed representation) {
//...
}
  1. sizeof is an operator that is evaluated at compile-time (except for C99 variable-length arrays), which evaluates to the number of bytes that the expression contains. Because only the type of the operand matters, the operand itself is not evaluated, and the compiler turns the first snippet below into the second.
char a = 0;
printf("%zu", sizeof(a++));

char a = 0;
printf("%zu", 1);

The compiler is then allowed to operate on the result further. The compiler must have a complete definition of the type at compile-time, not link time, or else you may get an odd error. Consider the following:

// file.c
struct person;
printf("%zu", sizeof(struct person)); // Error: struct person is incomplete here

// file2.c
struct person {
// Declarations
};

This code will not compile because sizeof cannot be evaluated in file.c without knowing the full declaration of the person struct. That is typically why programmers either put the full declaration in a header file or abstract the creation and the interaction away so that users cannot access the internals of the struct. Additionally, if the compiler knows the full length of an array object, it will use that in the expression instead of having it decay into a pointer.

char str1[] = "will be 11";
char* str2 = "will be 8";
sizeof(str1) // 11 because it is an array (10 characters plus the NUL byte)
sizeof(str2) // 8 because it is a pointer (on a typical 64-bit machine)

Be careful when using sizeof for the length of a string!

  1. static is a storage class specifier with several meanings. The two most common are:
  • (a) When used with a global variable or function declaration, it means that the scope of the variable or the function is limited to the file.
  • (b) When used with a variable inside a function, it declares that the variable has static allocation, meaning that the variable is allocated once at program startup, not every time the function is called, and its lifetime is extended to that of the program.
// visible to this file only
static int i = 0;
static int _perform_calculation(void) {
  // ...
}
char *print_time(void) {
  static char buffer[200]; // Shared every time a function is called
  // ...
}
  1. struct is a keyword that allows you to group multiple types together into a new structure. C structs are contiguous regions of memory whose elements can be accessed as if they were separate variables. Note that there might be padding between elements, such that each variable is memory-aligned (typically, it starts at a memory address that is a multiple of its size).
struct hostname {
  const char *port;
  const char *name;
  const char *resource;
}; // You need the semicolon at the end

// Assign each individually
struct hostname facebook;
facebook.port = "80";
facebook.name = "www.facebook.com";
facebook.resource = "/";

// You can use static initialization in later versions of c
struct hostname google = {"80", "www.google.com", "/"};
  1. switch case default Switches are essentially glorified jump statements, meaning that you take either a byte or an integer and the control flow of the program jumps to the matching case. Note that the various cases of a switch statement fall through. It means that if execution starts in one case, the flow of control will continue through all subsequent cases until a break statement.
switch(/* char or int */) {
  case INT1: puts("1");
  case INT2: puts("2");
  case INT3: puts("3");
}

If we give a value of 2 then:

switch(2) {
  case 1: puts("1"); /* Doesn’t run this */
  case 2: puts("2"); /* Runs this */
  case 3: puts("3"); /* Also runs this */
}

One of the more famous examples of this is Duff’s device which allows for loop unrolling. You don’t need to understand this code for the purposes of this class, but it is fun to look at [2].

send(to, from, count)
register short *to, *from;
register count;
{
  register n=(count+7)/8;
  switch(count%8){
    case 0: do{ *to = *from++;
    case 7: *to = *from++;
    case 6: *to = *from++;
    case 5: *to = *from++;
    case 4: *to = *from++;
    case 3: *to = *from++;
    case 2: *to = *from++;
    case 1: *to = *from++;
    }while(--n>0);
  }
}

This piece of code highlights that switch statements are goto statements, and you can put any code on the other end of a switch case. Most of the time it doesn’t make sense, some of the time it just makes too much sense.

  1. typedef declares an alias for a type. Often used with structs to reduce the visual clutter of having to write ‘struct’ as part of the type.
typedef float real;
real gravity = 10;
// Also typedef gives us an abstraction over the underlying type used.
// In the future, we only need to change this typedef if we
// wanted our physics library to use doubles instead of floats.

typedef struct link link_t;
// With structs, include the keyword 'struct' as part of the original types

In this class, we regularly typedef function pointers. For example:

typedef int (*comparator)(void*,void*);

int greater_than(void* a, void* b){
  return a > b; // Compares the two addresses, not the values they point to
}

comparator gt = greater_than;

This declares a function pointer type comparator that accepts two void* params and returns an integer.

  1. union is a new type specifier. A union is one piece of memory that many variables occupy. It is used to maintain consistency while having the flexibility to switch between types without maintaining functions to keep track of the bits. Consider an example where we have different pixel values.
union pixel {
  struct values {
    char red;
    char blue;
    char green;
    char alpha;
  } values;
  uint32_t encoded;
}; // Ending semicolon needed

union pixel a;
// When modifying or reading
a.values.red;
a.values.blue = 0x0;
// When writing to a file
fprintf(picture, "%" PRIu32, a.encoded); // PRIu32 is defined in <inttypes.h>
  1. unsigned is a type modifier that forces unsigned behavior in the variables it modifies. unsigned can only be used with primitive integer types (like int and long). There is a lot of behavior associated with unsigned arithmetic. For the most part, unless your code involves bit shifting, it isn’t essential to know the difference in behavior between unsigned and signed arithmetic.

  2. void is a double meaning keyword. When used in terms of function or parameter definition, it means that the function explicitly returns no value or accepts no parameter, respectively. The following declares a function that accepts no parameters and returns nothing.

void foo(void);

The other use of void is in pointer declarations. A void * pointer is just a memory address. void is specified as an incomplete type, meaning that you cannot dereference the pointer, but it can be converted at any time to any other object pointer type. Pointer arithmetic on a void * is not defined by the C standard.

int *array = void_ptr; // No cast needed
  1. volatile is a compiler keyword. This means that the compiler should not optimize its value out. Consider the following simple function.
int flag = 1;
pass_flag(&flag);
while(flag) {
  // Do things unrelated to flag
}

The compiler may, since the internals of the while loop have nothing to do with the flag, optimize it to the following even though a function may alter the data.

while(1) {
  // Do things unrelated to flag
}

If you use the volatile keyword, the compiler is forced to keep the variable in and perform that check. This is useful for cases where you are writing multi-process or multi-threaded programs so that we can affect the running of one sequence of execution with another.

  1. while represents the traditional while loop. There is a condition at the top of the loop, which is checked before every execution of the loop body. If the condition evaluates to a non-zero value, the loop body will be run.

C data types

There are many data types in C. As you may realize, all of them are either integers or floating point numbers and other types are variations of these.

  1. char Represents exactly one byte of data. The number of bits in a byte might vary. unsigned char and signed char are always the same size. This must be aligned on a boundary (meaning you cannot use bits in between two addresses). The rest of the types will assume 8 bits in a byte.
  2. short (short int) must be at least two bytes. This is aligned on a two byte boundary, meaning that the address must be divisible by two.
  3. int must be at least two bytes. Again aligned to a two byte boundary [5, P. 34]. On most machines this will be 4 bytes.
  4. long (long int) must be at least four bytes, which are aligned to a four byte boundary. On some machines this can be 8 bytes.
  5. long long must be at least eight bytes, aligned to an eight byte boundary.
  6. float represents an IEEE-754 single precision floating point number tightly specified by IEEE [1]. This will be four bytes aligned to a four byte boundary on most machines.
  7. double represents an IEEE-754 double precision floating point number specified by the same standard, which is aligned to the nearest eight byte boundary.

If you want a fixed width integer type, for more portable code, you may use the types defined in stdint.h, which are of the form [u]intwidth_t, where u (which is optional) represents the signedness, and width is any of 8, 16, 32, and 64.

Operators