8/29/2022

Silver Sponsor - Reuven Lerner - An intro to Python bytecodes

One of the most common myths that I encounter in my corporate training is that Python is an interpreted language. It's not really surprising that people believe that -- after all, Python is often referred to as a "scripting" language, and often has the feel of an interpreted language, one that is translated into machine code one line at a time.

But in fact, Python is a byte-compiled language: First, the code that you write is translated into bytecodes -- an interim, portable format that resembles a high-level assembly language. When you run your program, those bytecodes are executed by the Python runtime. This is pretty similar to how things work in a number of other platforms, including .NET and Java -- but the process in Python is so transparent that we often don't think about it.
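The two steps can be made visible with the builtin "compile", which turns source text into a code object that "exec" then runs. This is only a minimal sketch: the source string and the "<demo>" filename are made up for illustration.

```python
# Source code is just text until it is compiled.
source = "total = 2 + 3"

# Step 1: compile the text into a code object that holds bytecodes.
# "<demo>" is a placeholder filename; it only shows up in tracebacks.
code_obj = compile(source, "<demo>", "exec")
print(type(code_obj).__name__)          # code
print(type(code_obj.co_code).__name__)  # bytes

# Step 2: the Python runtime executes those bytecodes.
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])               # 5
```

Normally both steps happen automatically when a module is imported or a script is run, which is why they are so easy to overlook.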

This is often easiest to see when we define a function. Whenever we use "def", we actually do two things: First, we create a function object. Then we assign that object to a variable.  Both of these seemingly simple steps can be a bit surprising, even to people who have been using Python for many years.

First, the notion that Python has "function objects" seems a bit weird. But really, it's part of Python's overall philosophy that everything is an object. Every string is an instance of class "str", every dictionary is an instance of class "dict", and every function is an instance of class "function". (Note that while both "str" and "dict" are builtin names, "function" is not.) The fact that functions are objects allows us to store them in lists and dicts, and to pass them as arguments to other functions (e.g., the "key" parameter in the builtin "sorted" function). The fact that functions are objects also means that they have attributes, names following dots (.) that act like a private dictionary.
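A short sketch of what "functions are objects" allows in practice. The functions "shout" and "whisper" and the attribute "calls_allowed" are hypothetical names, used only for illustration.

```python
def shout(text):
    return text.upper() + "!"

def whisper(text):
    return text.lower() + "..."

# Functions can be stored in a list like any other object.
styles = [shout, whisper]
results = [style("Hello") for style in styles]
print(results)                  # ['HELLO!', 'hello...']

# They can be passed as arguments, e.g. as the "key" for sorted().
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=len)
print(by_length)                # ['fig', 'pear', 'banana']

# They have a type and attributes, and we can add attributes of our own.
print(type(shout).__name__)     # function
shout.calls_allowed = 3         # hypothetical attribute, for illustration only
print(shout.__name__, shout.calls_allowed)   # shout 3
```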

The fact that "def" assigns our newly created function object to a variable is also a bit surprising to many, especially those coming from languages in which functions and data are in separate namespaces. In Python, functions and data share the same namespaces, which means that you cannot have both a variable named "x" and a function named "x" in the same scope at the same time.
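A small sketch of that rebinding, using the name "x" purely as an example:

```python
x = 10
print(type(x).__name__)   # int

# "def" is an assignment: it rebinds the name x to a new function object.
def x():
    return "I am a function now"

print(type(x).__name__)   # function
print(x())                # I am a function now

# Assigning again replaces the function just as easily.
x = "back to data"
print(callable(x))        # False
```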

So if I execute the following code in Python:

    def hello(name):
        return f'Hello, {name}!'

I have assigned a new value, a function object, to the variable "hello".  I can even ask Python what type of object the variable refers to, using the "type" builtin:

    >>> type(hello)
    <class 'function'>

It doesn't matter what "hello" might have referred to before; once we have executed "def", the variable "hello" now refers to a function object. We can call our function with parentheses:

    >>> hello('world')

Not surprisingly, we get the following back:

    'Hello, world!'

What happens, though, when we execute our function? In order to understand that, we'll need to have a close look at what is done at compile time (i.e., when we define our function) and at runtime (i.e., when we actually run our function).

I mentioned above that when we define a function, we create a function object, and that the object (like all others in Python) has attributes. The most interesting attribute on a function object is called "__code__" (pronounced "dunder-code" in the Python world, where "dunder" means "double underscore before and after a name"). This is the code object, the core of what is defined when we create a function. The code object itself has a number of attributes, the most interesting of which all start with "co_".  We can see a full list with the "dir" builtin:

    >>> dir(hello.__code__)

Here's a list of the attributes (a subset of the list that you'll get from running "dir") that start with co_:

    ['co_argcount',
     'co_cellvars',
     'co_code',
     'co_consts',
     'co_filename',
     'co_firstlineno',
     'co_flags',
     'co_freevars',
     'co_kwonlyargcount',
     'co_lines',
     'co_linetable',
     'co_lnotab',
     'co_name',
     'co_names',
     'co_nlocals',
     'co_posonlyargcount',
     'co_stacksize',
     'co_varnames']
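If you'd rather not pick through the full "dir" output by hand, the filtering can be done in code. This is a sketch; the exact set of co_ attributes depends on the Python version you're running, so your list may differ from the one above.

```python
def hello(name):
    return f'Hello, {name}!'

code = hello.__code__

# Keep only the attributes whose names start with "co_".
co_attrs = [attr for attr in dir(code) if attr.startswith("co_")]
print(co_attrs)

# A few of them are easy to interpret directly.
print(code.co_name)       # hello
print(code.co_argcount)   # 1
print(code.co_varnames)   # ('name',)
```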

I wrote above that when we define a function, Python compiles it into bytecodes. Those are stored inside of the co_code attribute. We can thus see the bytecodes for a function by looking at it:

    >>> print(hello.__code__.co_code)

The good news is that this works. But the bad news is that it's pretty hard to understand what's going on here:

    b'd\x01|\x00\x9b\x00d\x02\x9d\x03S\x00'

What we see here is a bytestring, a sequence of bytes -- as opposed to a sequence of characters, which is what we would have in a normal Python string. This is the code that Python executes when we run our function.
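The difference between bytes and characters is easy to check for yourself. In this sketch, the length and the specific byte values will vary with the Python version, so they are printed without an expected result.

```python
def hello(name):
    return f'Hello, {name}!'

raw = hello.__code__.co_code

print(type(raw).__name__)   # bytes
print(len(raw))             # number of bytes; varies by Python version

# Indexing a bytes object yields an integer between 0 and 255,
# not a one-character string.
first = raw[0]
print(first, type(first).__name__)

# Printable bytes are displayed as ASCII characters, which is why the
# output above mixes letters such as "d" with escapes such as \x01.
pair = bytes([100, 1])
print(pair)                 # b'd\x01'
```

In other words, the "d" at the start of the bytestring shown above is simply the byte with the value 100.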

But wait -- what are these codes? What do they mean, and what do they do? In order to understand, we can use the "dis" function in the "dis" module. That module (and its function) are short for "disassemble," and they allow us to break apart the function and see it:

    >>> import dis
    >>> dis.dis(hello)
      2           0 LOAD_CONST               1 ('Hello, ')
                  2 LOAD_FAST                0 (name)
                  4 FORMAT_VALUE             0
                  6 LOAD_CONST               2 ('!')
                  8 BUILD_STRING             3
                 10 RETURN_VALUE

Things might now start to make more sense, even though we've also opened up a bunch of new mysteries.  The (CAPITALIZED) names that we see are the bytecodes, the names of the pseudo-assembly commands that Python recognizes.  The integers immediately to the left of each command indicate the index into co_code with which each bytecode is associated. (The 2 in the leftmost column is the line number in our source code.)

So the byte at index 0 is for LOAD_CONST. The byte at index 2 is LOAD_FAST. And the byte at index 4 is FORMAT_VALUE.

But wait: What do these commands do? And why are we only using the even-numbered bytes?
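A short answer to the second question, at least for CPython 3.6 and later: each instruction is encoded as a two-byte unit, one byte for the operation and one for its argument, so instructions always start at even offsets. The sketch below uses "dis.get_instructions" to pair up the bytes; the opcode names and numbers it prints will vary between Python versions.

```python
import dis

def hello(name):
    return f'Hello, {name}!'

raw = hello.__code__.co_code

for instr in dis.get_instructions(hello):
    # instr.offset is the index into co_code where the instruction starts.
    opcode_byte = raw[instr.offset]        # even index: which operation
    arg_byte = raw[instr.offset + 1]       # odd index: its argument
    print(instr.offset, instr.opname, opcode_byte, arg_byte)

offsets = [instr.offset for instr in dis.get_instructions(hello)]
print(all(offset % 2 == 0 for offset in offsets))   # True
```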

The LOAD_CONST instruction tells Python to load a constant value. We're not talking about a constant in the general language, but rather a constant value that was stored on the function's code object when it was compiled. At compile time, Python noticed that there was a string, 'Hello, '. It stored that string as a constant on the code object, in a tuple named co_consts. The function can thus retrieve that constant whenever it needs it.  We can, of course, look at the co_consts tuple ourselves:

    >>> hello.__code__.co_consts
    (None, 'Hello, ', '!')

As you can see, the element at index 1 in our function's co_consts is the string 'Hello, '.  So the first bytecode loads that constant, making it available to our Python interpreter.  But wait, where did this constant come from? Look carefully, and you'll see that it's the first part of the f-string that we return in the body of the function. That's right -- while we think of an f-string as a static string with a dynamic component (inside of the {}), Python thinks of it as the combination of static parts (which are stored in co_consts as strings) and dynamic parts (which are evaluated at runtime).

So our f-string, which looks like this:

    f'Hello, {name}!'

Is turned by the Python compiler into

    'Hello, ' (constant) + name (variable lookup) + '!' (constant)

And indeed, we can see that co_consts[1] is 'Hello, ', and co_consts[2] is the single-character string '!'.  In between, we'll need to get the value of the "name" variable.
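Here is a sketch of that split. The function "hello_concat" is a hand-written, roughly equivalent version, added only for comparison; the real bytecode joins the pieces with BUILD_STRING rather than with "+". The exact contents and order of co_consts can differ between Python versions, so the sketch only checks that both static pieces are present.

```python
def hello(name):
    return f'Hello, {name}!'

def hello_concat(name):
    # Hand-written approximation of what the f-string is broken into.
    return 'Hello, ' + format(name) + '!'

consts = hello.__code__.co_consts
print(consts)   # the static pieces appear here; details vary by version

# The static parts were stored at compile time...
print('Hello, ' in consts, '!' in consts)         # True True

# ...and the dynamic part is filled in at runtime.
print(hello('world') == hello_concat('world'))    # True
```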

In order to do this, Python needs to know if "name" is a local variable or a global one. In this case, it's an easy call: Because "name" is a parameter to our function, it is by definition a local variable. Local variable values are retrieved using the LOAD_FAST bytecode, which we see at byte index 2. But how does it know which local variable to retrieve?

Fortunately, our function's code object also has an attribute named co_varnames, a tuple of strings with all of the local variable names:

    >>> hello.__code__.co_varnames
    ('name',)
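The same lookup can be tried on a slightly larger function. In this sketch, "greet" and its "punctuation" parameter are made up for illustration; the point is only that every local variable, parameter or not, gets a position in the tuple.

```python
def greet(name, punctuation='!'):
    greeting = 'Hello, '          # a local variable that is not a parameter
    return greeting + name + punctuation

code = greet.__code__

# Parameters come first, followed by the other local variables.
print(code.co_varnames)      # ('name', 'punctuation', 'greeting')
print(code.co_argcount)      # 2
print(code.co_nlocals)       # 3

# An index handed to LOAD_FAST is a position in this tuple.
print(code.co_varnames[0])   # name
```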

So the argument 0 which is given to LOAD_FAST indicates that we want to retrieve the value of local variable 0, aka "name".  In the first two bytecodes, we thus load a constant and the value of a variable. Then Python uses the special FORMAT_VALUE bytecode to format our "name" variable:

      2           0 LOAD_CONST               1 ('Hello, ')
                  2 LOAD_FAST                0 (name)
                  4 FORMAT_VALUE             0

Usually, formatting a value means turning it into a string using "str".  But some objects have a special "__format__" method defined, which allows them to have a special output in this context.
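A sketch of an object that takes advantage of this. The "Temperature" class and its "short" format spec are hypothetical, invented only to show "__format__" being called from an f-string.

```python
class Temperature:
    """Hypothetical class, used only to illustrate __format__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f'{self.degrees} degrees'

    def __format__(self, spec):
        # spec is whatever follows the colon inside the braces;
        # it is an empty string when no format spec is given.
        if spec == 'short':
            return f'{self.degrees}C'
        return str(self)

t = Temperature(21)
print(f'{t}')         # 21 degrees
print(f'{t:short}')   # 21C
print(f'{42}')        # 42
```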

We now have two strings on our stack -- and yes, the Python runtime is a stack machine, which you might have learned about if you studied computer science. But we need the exclamation point, so we load that, too:

                6 LOAD_CONST               2 ('!')

We now have three strings on the stack -- our initial constant, the formatted version of "name", and the constant '!'.  We now create a string, based on these three components, with another bytecode, BUILD_STRING. We hand BUILD_STRING an argument of 3, to indicate that it should create a string from the three topmost items on the stack:

                8 BUILD_STRING             3

And that's it! We have created the string that we wanted, based on the user's argument. The time has come to return that value, and we do so with the special RETURN_VALUE bytecode:

               10 RETURN_VALUE
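To make the stack behavior concrete, here is a toy stack machine that handles just the six instructions we've walked through. It's an illustration only, not how CPython's evaluation loop is actually written; the "run" function and the way the program is encoded as (name, argument) pairs are my own simplifications.

```python
def run(instructions, consts, local_values):
    """Toy stack machine for the six instructions shown above."""
    stack = []
    for opname, arg in instructions:
        if opname == 'LOAD_CONST':
            stack.append(consts[arg])
        elif opname == 'LOAD_FAST':
            stack.append(local_values[arg])
        elif opname == 'FORMAT_VALUE':
            stack.append(format(stack.pop()))
        elif opname == 'BUILD_STRING':
            # Pop the top `arg` items and join them in their original order.
            pieces = stack[-arg:]
            del stack[-arg:]
            stack.append(''.join(pieces))
        elif opname == 'RETURN_VALUE':
            return stack.pop()
        else:
            raise ValueError(f'unsupported instruction: {opname}')

program = [
    ('LOAD_CONST', 1),
    ('LOAD_FAST', 0),
    ('FORMAT_VALUE', 0),
    ('LOAD_CONST', 2),
    ('BUILD_STRING', 3),
    ('RETURN_VALUE', None),
]

result = run(program, consts=(None, 'Hello, ', '!'), local_values=('world',))
print(result)   # Hello, world!
```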

How often do you really need to read Python bytecodes? Never. But reading the bytecodes does give you a sense of how Python works, what it's doing behind the scenes, how particular functionality (e.g., f-strings) is implemented, and which decisions are made at compile time, rather than runtime.  Understanding Python's division of labor between compile time and runtime can, in my experience, help to make sense of error messages you get, and also to put into context so many other parts of Python that can seem mysterious.
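One error message that makes more sense with this division in mind is UnboundLocalError. The sketch below is my own example, with made-up names "counter" and "increment": the compiler decides that a name is local, and the consequences only show up at runtime.

```python
counter = 0

def increment():
    # Because "counter" is assigned somewhere in this function, the compiler
    # marks it as a local variable for the whole function body.
    counter = counter + 1
    return counter

print(increment.__code__.co_varnames)   # ('counter',)

error_name = None
try:
    increment()
except UnboundLocalError as error:
    # Raised at runtime, but caused by a decision made at compile time.
    error_name = type(error).__name__

print(error_name)                       # UnboundLocalError
```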

I'll be talking about these and other parts of Python bytecodes, especially through the lens of functions, at PyCon APAC 2022, in my talk, "Function dissection lab." I hope to see you there!

8/28/2022

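Platinum Sponsor - PyCon APAC 2022 x Cathay Financial Holdings: Gather Town event sneak preview. Tech talks, career consultations, and limited gifts: I want them all!

Want to know how data is used to solve problems in the financial industry?

Drawn to technical work in the financial industry, but can't find anyone to tell you more?

Cathay Financial Holdings' dedicated PyCon booth is now open in Gather Town! And there are gifts to be had!

Cathay has long been committed to digital transformation, actively building an ecosystem of internal and external services through "big data" and "innovative technology." This year, the Cathay booth will host experts from Cathay Life Insurance, Cathay Century Insurance, and Cathay United Bank for three short talks on how the subsidiaries create new opportunities through data. The booth also offers interactive consultations. Whether you are interested in the financial industry or want to learn more about what the work is really like, come and chat with us!

Booth gifts you won't want to miss:

Gift 1: Attend a tech talk at the booth for a chance to win an NT$100 7-11 gift voucher

Gift 2: Complete the mini tasks "fill out the questionnaire" and "join the community group" for a chance to win an NT$50 Uber Eats voucher for each! (Quantities are limited and available while supplies last. Each winner may redeem one Uber Eats voucher; the voucher must be used in full in a single transaction, and the terms of use will be stated when the voucher is sent out.)

Cathay recruitment: https://bit.ly/3IOFsFU

Infinity Lab community group: https://bit.ly/fintechinfinitylab

Tech talk information: talks will be held at the Cathay booth in Gather Town SpaceA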

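Session 1: Optimal sales agent matching

Time: 9/3 (SAT) 11:15~11:25

Speakers: 詹珺崴, Analyst / 黃喬敬, Manager, Data Management Department, Cathay Life Insurance

Content: Insurance business depends heavily on sales agents, but which agent is the best fit for a given customer? The Cathay team used network-based feature extraction, combined with records of interactions between customers and agents, to train machine learning models, and successfully built an agent recommendation system that helps recommend the most suitable agent for each customer.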

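Session 2: RESTful to Kafka: implementing a real-time model scoring service

Time: 9/3 (SAT) 14:35~14:45

Speaker: 林子暐, Data Department, Cathay United Bank

Content: Model results produced by the data department need to reach front-line systems closer to real time, and the architecture built on RESTful API requests has gradually become cumbersome and repetitive. The team therefore planned to use containerization to implement streaming data processing and service load monitoring, sending multiple model scores to the stream-processing platform Kafka and thereby streamlining how upstream and downstream systems obtain the model scores each of them needs.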

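Session 3: Intelligent commercial insurance platform: technology-driven transformation of the business model

Time: 9/4 (SUN) 11:05~11:15

Speaker: 李郁芳, Data Technology Development Department, Cathay Century Insurance

Content: How can technology transform the complex commercial insurance business and speed up the purchase of insurance products? Using a data middle platform, web crawling, machine learning, spatial computing, and other data technologies, the team has turned quotes that used to take several days into ones that can be completed in 2~3 minutes in front of the customer. This not only makes commercial insurance quoting simpler and reduces customer waiting time, it also makes sales agents more professional and increases the company's profit potential. The technology behind this project also obtained a utility model patent this year and has high commercial value.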

 

#CathayTechTalks #CareerConsultation #ExclusiveGifts


 

8/25/2022

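Platinum Sponsor - Micron Smart Manufacturing: accelerating the development of the global semiconductor industry

Micron began adopting big data technology at the end of 2014. With new smart manufacturing technologies, the Micron Global Operations team can deploy complex semiconductor architectures, processes, and technologies to innovate and build new products, creating value for customers, investors, and the business.

What role does Micron Taiwan play in technology?

Micron Taiwan plays a key role in developing Micron's leading DRAM products. As a high-volume DRAM manufacturing center of excellence, Micron Taiwan is committed to producing DRAM with cutting-edge technology for use in servers, personal computers, GPUs, mobile phones, high-performance computing, and other fields.

Micron's Taiwan sites use smart manufacturing to accelerate innovation and improve product quality and efficiency. Through smart manufacturing technologies, Micron Taiwan has achieved significant improvements across its labor productivity metrics, shortened learning cycles to raise output, and saved a large amount of energy.

Koen De Backer, Vice President of Micron Smart Manufacturing & AI (SMAI), said: "This recognition is a testament to Micron's success in adopting and integrating Industry 4.0 technologies and in shaping the future of semiconductor manufacturing. By implementing Industry 4.0 technologies across our sites, we will continue to lead the future of semiconductor manufacturing and deliver higher-performance products to our customers."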

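Key factors in deploying smart manufacturing at scale

By adopting Industry 4.0 technologies, Micron can reduce labor-intensive operations and redesign individual jobs so that they deliver more value. These changes do not reduce headcount; they raise employees' skills. We will be able to identify which team members are suited to upskilling and career development, provide on-the-job and classroom training, and place them in other roles.

By adopting Industry 4.0 technologies, Micron gives team members access to data at any time, making it easy for them to perform tasks remotely and to monitor the health and status of multiple tools without being on site or near the equipment.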

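Micron Smart Manufacturing (Smart Manufacturing & AI, SMAI)

Micron Smart Manufacturing includes SMTS (SMAI technical solution team), a multinational data team that works with advanced cloud technologies. Through big data processing technologies and cloud services, the team integrates and builds standardized solutions and platforms to accelerate smart manufacturing within Micron, and has successfully driven a range of AI applications, including optimization of volume production processes, quality process control and optimization, and automation that reduces manual operations.

If you enjoy big data processing, want to help build data and AI applications for advanced semiconductor processes, and hope to join a team of top talent from around the world, the Micron Smart Manufacturing team welcomes you.

To learn more, visit the Micron website at micron.com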