C Pitfalls and Techniques, Part 46: How do you get the result of a thread function, and how do you know whether it succeeded or failed?

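In Part 41 of this series, we saw that some tasks in a C program can be moved to the background so that they do not block the main logic and leave the program in an "unresponsive", apparently frozen state that hurts the user experience.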

Getting the return value of a thread function

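In C program development, a task that needs to run in the background is usually implemented as a thread, as Part 41 showed in some detail. The example in Part 41, however, did not care about the thread function's return value; it simply marked the thread as detached to avoid leaking resources.

What if we do need the thread function's return value? The thread runs in the background, so at first glance it seems hard even to tell when it finishes.

In the Part 41 example, the main logic waited for the thread function to finish by spinning in a while(1) loop. That is acceptable only as a demonstration; in real C programs, busy-waiting for a thread to end is not recommended.

In fact, the pthread library provides pthread_join() for exactly this purpose: it waits for a thread function to finish and retrieves its return value. Consider the following C example: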

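#include <stdio.h>
#include <pthread.h>

/* Crude busy-wait delay; volatile keeps the compiler from removing the loop. */
void delay(void)
{
    volatile unsigned long cnt = 999999;
    while (cnt--)
        ;
}

/* Thread function: prints a countdown, then returns NULL to the joiner. */
void *thread_1(void *p)
{
    (void)p;    /* unused */
    int cnt = 10;
    while (cnt--) {
        delay();
        printf("thread_1 is running, cnt: %d\n", cnt);
    }
    return NULL;
}

int main(void)
{
    pthread_t pid;    /* thread ID (not a process ID, despite the name) */
    pthread_create(&pid, NULL, thread_1, NULL);

    printf("\nmain is running...\n\n");

    /* Block until thread_1 finishes, then collect its return value in ret. */
    void *ret;
    pthread_join(pid, &ret);

    printf("thread exit with ret: %p\n", ret);

    return 0;
}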

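To keep the example short and focused on the topic, the code does no error handling.

The code is simple. The thread function thread_1() prints "thread_1 is running..." 10 times. After creating the thread, main() calls pthread_join() to wait for the thread function to finish and to receive its return value. Compiling and running the code produces the following output: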

# gcc t.c -lpthread
# ./a.out 

main is running...

thread_1 is running, cnt: 9
thread_1 is running, cnt: 8
thread_1 is running, cnt: 7
thread_1 is running, cnt: 6
thread_1 is running, cnt: 5
thread_1 is running, cnt: 4
thread_1 is running, cnt: 3
thread_1 is running, cnt: 2
thread_1 is running, cnt: 1
thread_1 is running, cnt: 0
thread exit with ret: (nil)

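As the output shows, pthread_join() does make main() wait until thread_1() has finished and exited, and the last line tells us that thread_1() returned NULL. Interested readers can change the return value of thread_1(), for example: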

void *thread_1(void *p)
{
    ...
    return (void *)8;    /* cast needed: the function returns void *, not int */
}

Compiling and running the modified code produces the following output:

...
thread exit with ret: 0x8

This matches what we expected.

After calling pthread_join(), does the thread function still "leak" resources?

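An earlier article explained that when a C program creates a thread and does not handle it properly, the stack and other resources used by that thread cannot be reclaimed by the system, which is a "resource leak". Over time, this can bring down the whole program.

The system does not reclaim a finished thread's resources because it cannot tell whether some part of the program still cares about the thread function's result. In Part 41, we called pthread_detach() to tell the system that nobody cares about the thread function's return value, so the system reclaimed the thread's resources as soon as it finished.

Note that because main() calls pthread_join() to receive the return value of thread_1(), in other words "someone cares" about the result of thread_1(), the thread running thread_1() must not also be marked as detached.

Conversely, once a thread has been marked as detached, pthread_join() can no longer be called to wait for it and receive its return value. In other words, in C program development you pick exactly one of pthread_detach() and pthread_join().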

Readers may now wonder: if the program calls pthread_join(), the thread is not detached, so will it "leak" resources? The most direct way to find out is to write a test, so let's modify main():

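int main()
{
    int i;

    /* Start all 10 threads first so they can run concurrently. */
    pthread_t pid[10];
    for(i=0; i<10; i++)
        pthread_create(&pid[i], NULL, thread_1, NULL);

    printf("\nmain is running...\n\n");

    /* Join every thread; each join lets the system reclaim that thread's resources. */
    void *ret;
    for(i=0; i<10; i++)
        pthread_join(pid[i], &ret);

    /* ret now holds the return value of the last thread joined. */
    printf("thread exit with ret: %p\n", ret);

    return 0;
}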

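This main() creates 10 threads and calls pthread_join() on each of them to wait for it and receive its return value. Readers can compile and run the modified code and, as in Part 41, use the top command to compare resource usage. There should be no resource leak.

This is the expected result. Previously, the system did not reclaim the resources of a finished thread because it could not tell whether anything still cared about the thread's result. Now main() calls pthread_join() to collect that result, so the system knows the result has been handled and can safely reclaim the resources.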

It is worth pointing out that some people might write the main() above like this instead:

int main()
{
    int i;
    pthread_t pid;
    void *ret;
    for(i=0; i<10; i++){
        pthread_create(&pid, NULL, thread_1, NULL);
        pthread_join(pid, &ret);
    }
    ...
    return 0;
}

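This is actually a pitfall, and beginners fall into it easily. What is wrong with this code? The problem is straightforward: pthread_join() waits for the thread function to finish before retrieving its return value. So the code above does not create 10 threads that run at the same time (assuming a platform with 10 or more cores). Instead, each time it creates a thread, it waits for that thread to finish before creating the next one.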

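Summary

This part discussed how to get a thread's return value in C program development. It comes down to using pthread_join(), and readers should watch out for the "pitfall" discussed at the end. pthread_join() also lets the system reclaim a thread's resources once the thread has finished, which avoids leaks. Keep in mind that you can use only one of pthread_detach() and pthread_join() on a given thread, and that after creating a thread you must in fact choose one of them: if the main logic cares neither when the thread ends nor what it returns, mark the thread as detached; otherwise, call pthread_join().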