My computer just read Xi’s work report – here are the results

China watchers have been counting the days until the beginning of the 19th Party Congress in China. Although there are fewer surprises than, say, in the German, Austrian or US elections, the days of the congress are a great time for speculating about the meaning of tie colours and why Xi Jinping used, or did not use, some particular word.

So, I was psyched up like a five-year-old on Christmas Eve and was getting ready to marinate myself in documents and media reports over the next few days. While the undisputed climax of the show is the (s)election of the Central Committee, the Politburo and the Politburo’s Standing Committee towards the end of the congress, the General Secretary’s work report is also something to greatly look forward to.

It goes without saying that I could not wait to get my hands on the document, and I even thought that I might start my day leafing through the speech while having my morning coffee. The speech, it turns out, is only three hours long, the document a crisp 32,500 Chinese characters. A piece of cake for a seasoned reader of CCP documents!

But it was not to be. Not only the Communist Party, but also the Department of East Asian Studies likes to convene important meetings on Wednesdays, so it was early afternoon before I even had a chance to look at the news and hunt for the document. Given that, as I read yesterday, people’s attention spans are now shorter than those of goldfish, it is not surprising that I got sidetracked by news articles about the speech while looking for the speech itself.

I am interested in the statistical analysis of text corpora, so two reports that counted how often certain words appeared in the document caught my eye. Bloomberg showed that the word “environment” appeared more frequently than the word “economy”, and Deutsche Welle’s Chinese edition found it newsworthy that Xi said “modernisation” more often than at the previous congress.
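For the curious, counts like these are easy to reproduce. Here is a minimal Python sketch of the idea. The sample sentence and the term list are invented for illustration and are not taken from the report, and I am assuming plain substring matching, which conveniently sidesteps Chinese word segmentation (the news outlets may well have counted differently).

```python
def count_terms(text, terms):
    """Return how often each term occurs in text as a plain substring."""
    return {term: text.count(term) for term in terms}

# Toy sentence invented for illustration, not a quote from the report.
sample = "经济发展要和环境保护并重，保护环境就是保护生产力。"

# "环境" = environment, "经济" = economy
counts = count_terms(sample, ["环境", "经济"])
print(counts)
```

Substring counting is crude: it cannot tell a word from a fragment of a longer word, so a serious comparison would want properly segmented text.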

These articles, combined with time pressure and the fact that the CPUs in my computer were still warm from yesterday’s course on Machine Learning in Chinese studies, provided fertile enough ground for some productive procrastination: instead of reading the Work Report myself, I would feed it into a topic modelling algorithm (I used Mallet for this exercise) and write about the experience.

I quickly located the report, transformed it into a text document, and cut it up into paragraphs. The algorithm was instructed to identify fifteen topics. Fifteen, I think, is enough to get a broad idea of what is mentioned in the document, yet not so many as to distract non-specialists from reading the post (I hope). To make the results accessible to readers who do not read Chinese, I had Google automatically translate every word in the topic model (remember, I am procrastinating!).
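If you wonder what "cut into paragraphs and look for topics" amounts to, here is a rough Python sketch. To be clear, this is not Mallet and not the code behind this post: it is a toy collapsed Gibbs sampler for LDA, written only to show the mechanics. The function names, the hyperparameters `alpha` and `beta`, the whitespace tokenisation and the four-line toy text are all my assumptions, and it uses two topics rather than fifteen to keep the example small.

```python
import random

def split_paragraphs(text):
    """Treat every non-empty line as one paragraph (one 'document' for the model)."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over lists of tokens.

    Returns (doc_topic_counts, topic_word_counts, vocabulary).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Take the token out of the counts ...
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # ... and resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    return doc_topic, topic_word, vocab

# Toy "report": four invented paragraphs, already split into words by spaces.
toy_text = """army defense military army
green environment protection green
army military defense force
environment green pollution protection"""

docs = [p.split() for p in split_paragraphs(toy_text)]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)

# Show the three heaviest words per topic.
for k, row in enumerate(topic_word):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

Real Chinese text would first need word segmentation, and a dedicated toolkit will be far faster and better tuned than this loop.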

Here is what the algorithm came up with. More frequent topics are listed first, and more frequent words in each topic are at the beginning of each line.

(Disclaimer: for the sake of my discipline I should also advise you that we’re merely having fun here and that if you really want to know what is in the document, you should read through it. Topic models can be treacherous, because they invite you to make up stories that might have little to do with what is actually in the corpus. Proceed at your own risk!)


Topic model, 15 topics

Construction system forwarding strengthening comprehensive system insisted reform perfect Innovation

Socialism Features Culture Construction Times insist thought Theory People road

People implementation History create National led struggle Hold political parties the dream

Modernization socialism Full Society National built World Construction Economy Strategy

Democracy Politics system the people the rule of law consultation National socialism Strengthening Monitoring

Cooperation human to promote Open World insisted Destiny Worldwide Trade Community

Security National Maintenance struggle interest split faced to deal Challenge contradiction

Army Strongarmy Defense pay attention Military implementation stick Construction era force

Social Governance People basic protection Employment livelihood Income realized Horizontal

Adhere to the masses discipline Supervision test Politics reinforced Anti-corruption coverage a problem

Politics Enhanced work Cadres skills Awareness insisted leadership good mechanism

Eco Green Production Protection Action Prevention Environment institution Regulation responsibility

Economy Optimization Regional Growth supply elements nurture entity to promote Structural

Talent Arts power strengthening Regional Research lead everyone Acura oriented

Education run Religion Society grassroots organizations propaganda people to accept on non-public


My first impression is very positive – I find it quite nice that the topics work out so well even with a comparatively short document!

The contents – well, very comprehensive! The report covers a lot of ground, addressing diverse issues like China’s particular (peculiar) brand of developmental socialism, its political system, calls for global cooperation, defence and national security, managing the masses, the quality of China’s cadre force, environmental policies and “green” manufacturing, and education. The first three topics probably sound familiar to those studying Xi Jinping – obviously, a good part of the speech was about exhorting the 90 million CCP members to help make the Chinese Dream a reality, with importance given to both theory and the lessons from history (how about Looking Back While Marching Forward as a catchy slogan for Xi’s theoretical contribution? I heard the slogans devised so far are quite unwieldy. You are welcome.)

Judging from these topics, the issues addressed in this work report are quite similar to those in earlier work reports. Work reports are always very comprehensive, and social issues, defence, self-improvement and modernisation have always been on the agenda, as has been the case with reflections on China’s political system (often in contrast to liberal democracy) and, at least in the last decade or so, environmental protection.

Against this background, it is interesting to see how much each topic is stressed. The figure below gives an idea of how much each topic “dominates” the document.


Frequency distribution of document-topic scores
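One plausible way to boil document-topic scores down to a single "prominence" number per topic is to average each topic's share over all paragraphs. The Python sketch below does just that. The function name and the scores are hypothetical, made up purely for illustration, and this is not necessarily how the figure was produced.

```python
def topic_prominence(doc_topic_scores):
    """Average each topic's share over all paragraphs; rank topics, largest first."""
    num_docs = len(doc_topic_scores)
    num_topics = len(doc_topic_scores[0])
    means = [
        sum(row[k] for row in doc_topic_scores) / num_docs
        for k in range(num_topics)
    ]
    return sorted(enumerate(means), key=lambda pair: pair[1], reverse=True)

# Hypothetical scores: three paragraphs (rows), three topics (columns).
# Each row sums to 1, as topic shares of one paragraph would.
scores = [
    [0.7, 0.2, 0.1],
    [0.6, 0.1, 0.3],
    [0.2, 0.2, 0.6],
]

ranking = topic_prominence(scores)
for topic, share in ranking:
    print(f"topic {topic}: {share:.2f}")
```

A plain average treats a one-line paragraph the same as a page-long one; weighting by paragraph length would be a reasonable alternative.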

Seeing how prominent the first three topics are, I would wager that the document comes across as more ideological than technocratic – with modernisation not for modernisation’s sake, but with all policies converging on the vision of a strong and prosperous China that leads, but also cooperates, that looks forward, but also learns from the past. It is also interesting that the “political system” topic is so strong, and that, judging from the associated terms, the next five years will see more of Xi’s brand of party dominance coupled with a heavy focus on internal discipline and carefully dosed public “participation”.

“Democracy”, the strongest term in the topic, should not be misunderstood as a commitment to democratisation. By CCP standards, China is already democratic, and recently the Party’s Xinhua News Agency even claimed that Chinese democracy “has a higher level of quality and efficiency” than its Western counterparts. In the yin-yang fashion of CCP official discourse, “more democracy” might really mean more (active and passive) consultation, but less democracy.

Or maybe not. Guess I’ll need to read the report to know what it really says…