#+Title: 1 ─ Introduction #+Subtitle: “Principles of programming languages” #+Author: Mark Armstrong, PhD Candidate, McMaster University #+Date: Fall, 2019 #+Description: A brief overview of topics we will be discussing in this course. #+Options: toc:nil #+Reveal_root: http://cdn.jsdelivr.net/reveal.js/3.0.0/ #+Reveal_init_options: width:1600, height:900, controlsLayout:'edges', #+Reveal_init_options: margin: 0.1, minScale:0.125, maxScale:5 #+Reveal_extra_css: local.css #+html: * LaTeX Header :ignore: #+LaTeX_header: \usepackage{amsthm} #+LaTeX_header: \theoremstyle{definition} #+LaTeX_header: \newtheorem{definition}{Definition}[section] #+LaTeX_header: \newenvironment{NOTES}{ #+LaTeX_header: \begin{center} #+LaTeX_header: \begin{tabular} #+LaTeX_header: {|p{0.9\textwidth}|} #+LaTeX_header: \hline\\ #+LaTeX_header: \begin{scriptsize} #+LaTeX_header: }{ #+LaTeX_header: \end{scriptsize} #+LaTeX_header: \\\\\hline #+LaTeX_header: \end{tabular} #+LaTeX_header: \end{center} #+LaTeX_header: } * Preamble ** Notable references - Concepts of Programming Languages, Sebesta, 10th ed. - Section 1.3 - Language Evaluation Criteria - Section 5.4 - The Concept of Binding - Section 5.5 - Scope - Concepts, Techniques, and Models of Computer Programming - Preface - Section 2.1.1 - Language syntax ** Update history - Sept. 12 :: - The description of the lifetime of stack dynamic variables has been changed. - From: “lifetime is the amount of runtime spent in that scope or sub-scopes”. - To: “lifetime is the portion of runtime between when the unit of code containing the variable is entered and when control returns to the invoker of the unit of code.” ** Table of contents # The table of contents are added using org-reveal-manual-toc, # and so must be updated upon changes or added last. 
#+HTML: #+begin_scriptsize - [[Welcome!][Welcome!]] - [[Instructor: Mark Armstrong][Instructor: Mark Armstrong]] - [[Teaching assistant: Musa Alhassy][Teaching assistant: Musa Alhassy]] - [[Teaching assistant: Maryam Kord][Teaching assistant: Maryam Kord]] - [[About the notes and slides for this course][About the notes and slides for this course]] - [[Emacs][Emacs]] - [[Org mode][Org mode]] - [[Literate programming][Literate programming]] - [[Purpose and goals of this course][Purpose and goals of this course]] - [[Informal objectives][Informal objectives]] - [[Course preconditions][Course preconditions]] - [[Course postconditions][Course postconditions]] - [[Formal rubric for the course][Formal rubric for the course]] - [[Principles of programming languages][Principles of programming languages]] - [[A brief definition of “programming language”][A brief definition of “programming language”]] - [[The two components of every programming language][The two components of every programming language]] - [[Implementation strategy: another component?][Implementation strategy: another component?]] - [[Paradigms: classifying programming languages][Paradigms: classifying programming languages]] - [[The commonly discussed paradigms][The commonly discussed paradigms]] - [[A taxonomy of programming paradigms][A taxonomy of programming paradigms]] - [[One language, many paradigms][One language, many paradigms]] - [[Exercise: On paradigms][Exercise: On paradigms]] - [[Subjective criteria for comparing and evaluating languages][Subjective criteria for comparing and evaluating languages]] - [[Readability][Readability]] - [[Writability][Writability]] - [[Reliability][Reliability]] - [[Cost][Cost]] - [[Key concepts and terminology][Key concepts and terminology]] - [[Compiled, interpreted, and hybrid approaches][Compiled, interpreted, and hybrid approaches]] - [[Compilation][Compilation]] - [[Interpreters][Interpreters]] - [[Hybrid methods][Hybrid methods]] - [[Exercise: Researching implementations][Exercise: Researching implementations]] - [[Abstraction][Abstraction]] - [[Levels of abstraction][Levels of abstraction]] - [[Static vs. dynamic][Static vs. dynamic]] - [[Scope][Scope]] - [[Exercise: On dynamic scoping][Exercise: On dynamic scoping]] - [[Exercise: Entities which introduce scope][Exercise: Entities which introduce scope]] - [[Variables and memory][Variables and memory]] - [[The data segment, stack, and heap][The data segment, stack, and heap]] - [[Kinds of memory allocation][Kinds of memory allocation]] - [[Exercise: Memory allocation in C++][Exercise: Memory allocation in C++]] - [[Lifetime][Lifetime]] - [[Garbage collection][Garbage collection]] - [[Exercises][Exercises]] #+end_scriptsize #+HTML: * Welcome! #+begin_center #+attr_html: :style text-align:center Welcome to the course! 
#+end_center

** Instructor: Mark Armstrong

#+begin_quote
#+attr_org: :width 200px
#+attr_html: :width 200px
#+attr_latex: :width 200px
#+attr_html: :alt Mark Armstrong photo
[[./images/markarmstrong.jpg]]
#+end_quote

- Email: armstmp “at” mcmaster.ca
- Website: http://www.cas.mcmaster.ca/~armstmp/
- Office: ITB-229
# - Office hours by request

** Teaching assistant: Musa Alhassy

#+begin_quote
#+attr_org: :width 200px
#+attr_html: :width 200px
#+attr_latex: :width 200px
#+attr_html: :alt Musa Alhassy photo
[[./images/musa_coffee.jpg]]
#+end_quote

- Email: alhassm “at” mcmaster.ca
- Website: https://alhassy.github.io
- Office: ITB-229
# - Office hours TBD

** Teaching assistant: Maryam Kord

- Email: kordm “at” mcmaster.ca
# - Office: ...
# - Office hours TBD

* About the notes and slides for this course

- The notes/slides for this course are developed in Emacs using Org mode.
  - Along with almost everything else for the course!
- Specifically,
  - the slides are created by exporting the Org files to HTML
    using the [[https://github.com/yjwen/org-reveal][org-reveal]] package, and
  - the ~pdf~ documents are created by exporting the Org files to LaTeX.
- The slides are laid out in two dimensions.
  - When viewing them in your browser, press ~?~ to see a help page
    describing the possible interactions.
  - Most importantly, press ~esc~ or ~o~ to see an overview of the slide deck.

** Emacs

Emacs is
#+begin_quote
An extensible, customizable, free/libre text editor — and more.
At its core is an interpreter for Emacs Lisp, a dialect of the Lisp
programming language with extensions to support text editing.

— GNU Emacs [[https://www.gnu.org/software/emacs/emacs.html][homepage]]
#+end_quote

- The learning curve is fairly steep compared to most similar tools, but:
  - Distributions (configurations) such as [[http://spacemacs.org/][Spacemacs]] and
    [[https://github.com/hlissner/doom-emacs][doom-emacs]] can ease the learning curve.
  - As the longest lived extensible text editor, Emacs has
    an unrivaled ecosystem of extensions.
  - Emacs is best configured /by writing code/,
    a natural exercise of your talents.
- Especially if you have not settled on an editor for general use,
  check out Emacs!

** Org mode

Even if you do not use Emacs, I recommend checking out Org mode.
#+begin_quote
Org mode is for keeping notes, maintaining TODO lists, planning projects,
and authoring documents with a fast and effective plain-text system.

— Org mode [[https://orgmode.org/][homepage]]
#+end_quote

While Org mode is used in plain-text documents, it supports exporting
to various formats, including ~pdf~ (through LaTeX), ~html~, and ~odt~,
all “out of the box”!
- Write once, export many.

Packages for Org mode are available for both VSCode and Atom.

** Literate programming
:Properties:
#+Link: litprog http://www.literateprogramming.com/
:End:

Org mode is especially useful in that it provides support
for /literate programming/.
#+begin_quote
Let us change our traditional attitude to the construction of programs:
Instead of imagining that our main task is to instruct a computer
what to do, let us concentrate rather on explaining to human beings
what we want a computer to do.

— Donald E. Knuth, “[[litprog:][Literate Programming]]”
#+end_quote

Regardless of the tools you choose to use, in this course
you will be expected to follow this philosophy.
- The purpose of your programs here is to
  /explain to the course staff what you want the computer to do/.

That is, documentation is not optional, and not to be an afterthought.
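To make that expectation concrete, here is a small, hypothetical sketch
(in C++, though the language is irrelevant): the comments explain /what/
is being computed and /why/ the approach works, before the code says /how/.
#+begin_src C++
// Greatest common divisor of two non-negative integers, by Euclid's
// algorithm: gcd(a, b) is unchanged when b is replaced by a mod b,
// and gcd(a, 0) = a, so repeatedly taking remainders terminates
// at the answer.
constexpr int gcd(int a, int b)
{
  while (b != 0) {
    int r = a % b;   // only the remainder matters for the gcd
    a = b;
    b = r;
  }
  return a;
}

// A compile-time check doubles as an example of the intended use.
static_assert(gcd(12, 18) == 6, "gcd(12, 18) should be 6");

int main() {}
#+end_src
The algorithm itself is beside the point; the point is that a reader can
follow the intent without executing the code in their head.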
* Purpose and goals of this course

The undergraduate calendar description of this course is
#+begin_quote
Design space of programming languages; abstraction and modularization
concepts and mechanisms; programming in non-procedural (functional and
logic) paradigms; introduction to programming language semantics.
#+end_quote

But we should be more specific.

** Informal objectives

- Expose you to several programming languages.
  - A relatively shallow but comprehensive survey.
  - Focussing on general-purpose languages.
- /Formally/ describe programming language syntax and semantics.
  - An application of theory you have learned previously.
- Analyse the building blocks of languages.
  - Programs can be /massive/ entities,
    but they are composed of relatively few /small/ pieces.
- Provide you with criteria by which to judge languages.
  - So you can identify what languages fit what tasks.
- Examine the origins of certain languages/groups of languages.
  - Historical context provides insight into why languages
    are designed the way they are.

** Course preconditions

Before beginning this course:
1. Students should know and understand:
   1. Basic concepts about integers, sets, functions, & relations.
   2. Induction and recursion.
   3. First order logic, axiomatic theories & simple proof techniques.
   4. Regular expressions & context-free grammars.
   5. Programming in imperative languages.
   6. Basic concepts of functional programming languages.
2. Students should be able to:
   1. Produce proofs involving quantifiers and/or induction.
   2. Understand the meaning of a given axiomatic theory.
   3. Construct regular sets & context-free languages.
   4. Produce small to medium scale programs in imperative languages.
   5. Produce small scale programs in functional languages.

** Course postconditions

After completion of this course:
1. Students should know and understand:
   1. The basics of several programming languages.
   2. Formal definitions of syntax & semantics
      for various simple programming languages.
   3. Various abstraction & modularisation techniques
      employed in programming languages.
2. Students should be able to:
   1. Reason about the design space of programming languages,
      in particular tradeoffs & design issues.
   2. Produce formal descriptions of syntax & semantics
      from informal descriptions, identifying ambiguities.
   3. Select appropriate abstraction & modularisation techniques
      for a given problem.
   4. Produce (relatively simple) programs in various languages,
      including languages from non-procedural paradigms.

** Formal rubric for the course

# This HTML is probably a bad hack... but it works as a hammer.
#+HTML:
#+begin_scriptsize
| Topic | Below | Marginal | Meets | Exceeds |
|-------+-------+----------+-------+---------|
| Familiarity with various programming languages (PLs) | Shows some competence in procedural languages, but not languages from other paradigms | Shows competence in procedural languages and limited competence in languages from other paradigms | Achieves competence with the basic usage of various languages | Achieves competence with intermediate usage of various languages |
| Ability to identify and make use of abstraction, modularisation constructs | Cannot consistently identify such constructs | Identifies such constructs, but does not consistently make use of them when programming | Identifies such constructs and shows some ability to make use of them when programming | Identifies such constructs and shows mastery of them when programming |
| Ability to comprehend and produce formal descriptions of PL syntax | Unable or rarely able to comprehend given grammars; does not identify ambiguity or precedence rules | Comprehends given grammars, but produces grammars which are ambiguous or which do not correctly specify precedence | Makes only minor errors regarding precedence or ambiguity when reading or producing grammars | Consistently fully understands given grammars and produces correct grammars |
| Ability to comprehend and produce operational semantics for simple PLs | Rarely or never comprehends such semantic descriptions | Usually comprehends such semantic descriptions, but cannot consistently produce them | Comprehends such semantic descriptions and produces them with only minor errors | Comprehends such semantic descriptions and produces them without errors |
| Ability to comprehend denotational and axiomatic semantics for simple PLs | Rarely or never comprehends such semantic descriptions | Inconsistently comprehends such semantic descriptions | Consistently comprehends such semantic descriptions | Consistently comprehends and can produce some simple semantic descriptions |
#+end_scriptsize
#+HTML:

* Principles of programming languages

Before we begin in earnest, let us set the stage
by answering some basic questions.
- What are “programming languages”?
- What are their components?
- How can we compare, group or classify them?

** A brief definition of “programming language”
:Properties:
#+Link: C11-standard http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
:End:

- A finite, formal language for describing (potentially) infinite processes.
- Usually the language is actually quite small!
  - A CFG for /all of C/ has only 91 nonterminals
    (as constructed in the C11 [[C11-standard:][international standard]], appendix 2).
  - We will construct simple languages with far fewer than 100 /productions/.
- /Formality/ is required because we want our programs to be run on machines.
  - Machines fare poorly with informal language.
- Considering just the set of commands executable by the processor,
  modern computers are fairly simple machines.

** The two components of every programming language

Languages can be distinguished by two components:
- *Syntax*: the rules for the form of legal programs.
- *Semantics*: the rules for the meaning of legal programs.

#+BEGIN_NOTES
/Colourless green ideas sleep furiously/.
#+END_NOTES

Change either, and you have a different language.

*** Implementation strategy: another component?

Sometimes /implementation strategy/ is also considered a component
of a programming language.
- i.e., whether it is compiled or interpreted.

We do not take this view.
- Implementation strategy is external to the language.
- A language may have many compilers, interpreters, runtime environments, etc.

** Paradigms: classifying programming languages

Paradigms are classifications of programming languages based on
what kinds of abstractions they (natively) support.
- One language may support many paradigms.
- It is often possible to write programs following a paradigm
  the language does not natively support.
  - For instance, following object oriented principles when writing C code
    (a small sketch follows the exercise at the end of this section).
- Native support is important!
  - It frees the programmer from having to “hold” the abstraction
    in their own mind.
  - We will see time and time again that well designed languages
    save the programmer work and mental effort!
- Supported abstractions should be /orthogonal/; separate from each other,
  but able to be combined in sensible, predictable ways.
  - Ideally there is a hierarchy of abstractions.

*** The commonly discussed paradigms

It is common to consider only a small number of paradigms,
usually in this hierarchy.
+---------------------------------------+---------------------+
| Imperative/Procedural                 | Declarative         |
+-----------------+---------------------+-------------+-------+
| Object-oriented | Non-Object Oriented | Functional  | Logic |
+-----------------+---------------------+-------------+-------+
But this is far from a complete picture!

*** A taxonomy of programming paradigms

#+begin_quote
#+attr_org: :width 500px
#+attr_html: :width 800px
#+attr_latex: :width 400px
[[./images/vanroy-paradigm-chart.png]]

— Van Roy, P. “Programming Paradigms for Dummies: What Every Programmer Should Know”
#+end_quote

*** One language, many paradigms

Different paradigms lend themselves to different tasks,
so it is beneficial for a language to support more than one.

But then, “More is not better (or worse), just different”!
#+begin_quote
All computation models have their place. It is not true that models
with more concepts are better or worse. This is because a new concept
is like a two-edged sword. Adding a concept to a computation model
introduces new forms of expression, making some programs simpler,
but it also makes reasoning about programs harder.
#+end_quote

#+BEGIN_NOTES
For example, adding mutable variables to functional languages
enables object-oriented programming techniques.
But reasoning in the presence of mutable variables
is notoriously difficult!
#+END_NOTES

*** Exercise: On paradigms

Pick a language you are familiar with,
and research what paradigms it supports.
(Don't expect to find a definite yes/no answer
for every paradigm from Van Roy's taxonomy).

Then try to determine paradigms which are not natively supported,
but are still feasible to follow when coding.
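To make the earlier point about native support concrete, here is a small
sketch; it is only an illustration, and the names (~CounterRec~, ~Counter~)
are invented for it. The first half follows object-oriented principles
“by hand”, C-style, with a record and free functions; the second half uses
the class construct, letting the language enforce the abstraction
(hidden state, bundled operations) rather than the programmer's discipline.
#+begin_src C++
#include <iostream>

// C-style: the "object" is a plain record; nothing stops other code
// from touching the field directly or bypassing these functions.
struct CounterRec { int count; };

void counter_increment(CounterRec &c) { c.count += 1; }
int  counter_value(const CounterRec &c) { return c.count; }

// Native object orientation: the data is private and the operations
// travel with it, so the abstraction is held by the language,
// not in the programmer's head.
class Counter {
  int count = 0;
public:
  void increment() { count += 1; }
  int  value() const { return count; }
};

int main()
{
  CounterRec r{0};
  counter_increment(r);

  Counter c;
  c.increment();

  std::cout << counter_value(r) << " " << c.value() << "\n";   // prints: 1 1
}
#+end_src
Both halves express the same idea; the difference is how much of the
abstraction the language carries for you.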
** Subjective criteria for comparing and evaluating languages

We can use a mixture of /objective/ and /subjective/ criteria
to compare and evaluate languages we discuss.

We use four main subjective criteria. Note there is overlap!
- Readability
- Writability
- Reliability
- Cost

*** Readability

- How easily can code be interpreted by humans?
  - How much “noise” is there?
  - How easily can errors be detected by visual inspection?
- How many constructs must the reader be familiar with?
- How many ways can constructs be combined?
  - Do the combinations behave in predictable ways?

The last two points deal with /orthogonality/.
#+begin_quote
Orthogonality in a programming language means that a relatively small
set of primitive constructs can be combined in a relatively small number
of ways to build the control and data structures of the language.

— Concepts of Programming Languages 10th ed.
#+end_quote

*** Writability

- How easily can humans create code?
- How many abstractions are supported?
- How easily can abstract ideas be captured in the language?
  - Of course this depends upon how well the idea fits the paradigm!
- How much /syntactic sugar/ is available?
  - Syntactic sugar is syntax which does not add constructs/features,
    only (presumably simpler) ways to express existing constructs/features.

#+BEGIN_NOTES
For instance, the actor model of concurrency naturally fits
in an object oriented language,
but may not fit as well in a functional language.
But the functional reactive model of concurrency
will fit better in the functional paradigm.
#+END_NOTES

*** Reliability

- To what extent is /syntactically valid/ code guaranteed to /be/ valid?
  - What types of errors are caught before runtime?
- What tools are provided to the programmer
  for recovering from errors that do occur?
- Are “unsafe” features included in the language?
- What tools are provided/available to the programmer
  for /specifying/ and /verifying/ code?
  - Formal specification involves using mathematical language
    to describe the meaning of programs.
    - /Axiomatic semantics/.
  - Sometimes, automated tools can check that code meets formal specifications.
  - See CS3EA3: Software Specifications and Correctness.

*** Cost

- How much work is involved in
  - Training programmers in a language?
  - Writing new code?
    - Writability!
  - Maintaining a code base?
    - Readability and reliability!
- Are good programming tools/environments available? Do they have a cost?
- How efficient are compilers for the language (if code is to be compiled)?
- How efficient is compiled code (or the interpreter,
  if the code is to be interpreted)?
- What optimisation options are available for compiled code?

#+BEGIN_NOTES
There is a tradeoff between efficiency of the compiler
and the amount it optimises code.
- For ease of development, you want quick compilation,
  and can sacrifice efficiency of the compiled code.
- For a good end product, you want efficient compiled code,
  even if compilation takes longer.
#+END_NOTES

* Key concepts and terminology

Following are some key concepts and terms which
- we will see repeatedly throughout the course, or
- we should clarify before we move on.

(Some of this information simply won't have a better place to go).

** Compiled, interpreted, and hybrid approaches

Languages exist in a hierarchy.
- We are focused on the “upper portion” of the hierarchy.
  - High-level languages!
- We may use the term low-level for some languages, C in particular,
  but this is relative.
  - C is much higher level than assembly!
- We need to translate into machine code to run our programs.
  - Compilation, interpretation and “hybrid approaches”
    are strategies for doing this.

*** Compilation

Translate the whole program (and any libraries or other code resources
needed) ahead of running it.
- High upfront cost (time), for increased efficiency at runtime.
- Not portable; machine code is machine dependent.

*** Interpreters

Translate the program /as we are running it/.
- No upfront cost, but less efficient.
- Portable; can be run on any machine with an interpreter.
  - Alleviates some of the programmer's responsibility.
  - One user (or group) writes the interpreter /once/ (per machine type);
    it can be used by any number of users for any number of programs.
- Efficiency is improved by using *just-in-time compilation*.
  - Store the result of interpretation so it can be used again.
- Can achieve better error reporting.
  - Relationship between the original and translated code
    is known at runtime.
  - This relationship is discarded when compiling code.

*** Hybrid methods

Compile into an intermediate language,
which is then interpreted (or further compiled) at runtime.
- This intermediate language is usually similar to assembly.
  - But targets a virtual machine, not actual hardware!
  - Usually called /bytecode/.
- Greatly offsets the efficiency cost of interpretation.
- More portable than compiled code; just need a bytecode interpreter
  for each target machine.

*** Exercise: Researching implementations

Research for yourself what implementations are available
for languages you are familiar with.

Consider:
- Can you find different approaches (compiler, interpreter, hybrid)
  for the language?
- Is one approach “the norm” or “preferred” for the language?
- What is the availability for each implementation?
  - Who maintains them?
  - Are they open source?

** Abstraction

The concept of an /abstraction/ is central to programming languages.
#+begin_quote
We define an abstraction loosely as a tool or device
that solves a particular problem.

— Concepts, Techniques, and Models of Computer Programming
#+end_quote

#+begin_quote
Briefly, abstraction means the ability to define and then use
complicated structures or operations in ways that allow
many of the details to be ignored.

— Concepts of Programming Languages 10th ed.
#+end_quote

*** Levels of abstraction

+----------------------------+
| Highly abstract            |
| (User knows fewer details) |
+----------------------------+
| A writing instrument       |
|             ↓              |
| A pencil                   |
|             ↓              |
| A mechanical pencil        |
|             ↓              |
| This pencil                |
+----------------------------+
| Highly concrete            |
| (User knows all details)   |
+----------------------------+

Language ~A~ is “higher-level” than language ~B~ if it allows you to work
at a higher level of abstraction than language ~B~.

#+BEGIN_NOTES
For instance, C++ and C# are higher-level than C, because as
a modularisation construct, classes hide more implementation details
than the C alternative of defining structs and procedures that use them.

(Pure) functional languages are higher-level than imperative languages,
because the user need not be aware of the state of memory.

Logic languages are even higher-level than functional languages,
because the programmer does not concern themselves with the algorithm,
only the problem description.
#+END_NOTES

** Static vs. dynamic

We will see these terms in particular applied to many concepts.
- Static: set before runtime; fixed throughout the runtime.
  - Sometimes the phrase “compile time” is used.
  - But “static” may also mean “load time”.
    - I.e., the time at which the program is loaded into memory.
  - And if we are using an interpreter, there is no “compile time”!
- Dynamic: set during runtime; (potentially) changes throughout runtime.

** Scope

The /scope of an entity/ is the portion of the program
in which it is “visible”.
- Usually scope is statically determined.
  - So the scope of entities can be determined by reading the code.
- With dynamic scoping, the scope of entities depends upon
  the “execution trace”, i.e., the “path” through the program.
  - Dynamic scoping is definitely the exception.
  - The detriments strongly outweigh the benefits.

A /scope/ is a portion of the program to which
the visibility of entities may be limited.
- Design decision: what constructs introduce scopes?
  - Almost certainly subroutines do.
  - Do conditional branches? Loop bodies?

*** Exercise: On dynamic scoping

- Historically, Lisps (languages directly descended from Lisp)
  were dynamically scoped.
- Presently, the only widely used Lisp which uses dynamic scoping
  by default is Emacs Lisp (eLisp).

Try out this snippet of Emacs Lisp code, either in Emacs
or using an [[https://repl.it/languages/elisp][online REPL]].
#+begin_src emacs-lisp
(let ((x 2) (y 5))                 ; "global" variables x and y
  (progn
    (defun mult-x-y () (* x y))    ; returns x * y
    (defun A ()
      (let ((x 3))                 ; local variable x
        (mult-x-y)))
    (defun B ()
      (let ((y 10))                ; local variable y
        (A)))
    (message "mult-x-y returns %d, A returns %d and B returns %d"
             (mult-x-y) (A) (B))))
#+end_src

1) Ensure you understand the results.
2) Using this understanding, formulate some advantages and disadvantages
   of dynamic scoping.

*** Exercise: Entities which introduce scope

In a few different programming languages, investigate whether
conditional branches and loop bodies introduce scopes.
I.e., test out code such as
#+begin_src text
if B then
  int x = 0
endif
y = x  % Is x still in scope?
#+end_src

Specifically, investigate languages which have /iterating loops/
(usually called ~for~ loops).
What is the scope of an iterator for such loops?

** Variables and memory

- A /variable/ is an abstraction of a memory cell.
  - Variables come in many forms.
  - Typically we use /variable/ to mean a named, mutable cell of memory.
- When possible, we use more specific words.
  - Argument, parameter, constant, reference, etc.
  - Or qualify the word variable.

*** The data segment, stack, and heap

In the context of running a program (or a thread of a program):
- The /data segment/ is a portion of memory used to store
  the program itself and static/global variables.
- The /stack/ is an organised portion of memory allocated for the program.
  - Blocks at the “top” of the stack are reserved when entering
    units of code.
    - Units of code include subroutines and/or unnamed blocks,
      depending upon the language.
  - They are /popped off/ the stack when leaving the unit of code.
- The /heap/ is an unorganised portion of memory allocated for the program.
  - Allocation/deallocation may be independent of units of code.
  - Disorganisation can lead to inefficiency and other problems.
    - Cache misses, page faults, etc.
    - Running out of memory or appropriately sized contiguous blocks.
    - Garbage, dangling references, etc.

*** Kinds of memory allocation

We can distinguish four kinds of memory allocation.
- Static
  - Allocation is done /before runtime/ (static), in the data segment.
    - At load time, when the program is loaded into memory.
      (Not compile time).
- Stack dynamic
  - Allocation is dynamic, automatic, and on the stack.
  - The amount of memory needed must be known when the variable's scope
    is entered.
  - This is “the usual method” in most imperative languages.

#+REVEAL: split:t

- Implicit heap dynamic
  - Allocation is dynamic, automatic, and on the heap.
  - Memory is simply allocated when the variable is assigned;
    the programmer does not need to “do anything”.
- Explicit heap dynamic
  - Allocation is dynamic, /manual/, and on the heap.
  - The programmer must use a command to allocate memory
    /before/ assignments can be made.
    - E.g., ~malloc~ or ~new~.
  - In fact these blocks of memory can be considered /nameless variables/;
    we need other, /reference/ variables to /point/ to their address.

*** Exercise: Memory allocation in C++

Investigate:
- Which of the four kinds of memory allocation are available in C++.
- How each kind of memory allocation is denoted.

(A small sketch at the end of these notes may serve as a starting point.)

*** Lifetime

The /lifetime/ of a variable is the portion of the runtime (not the code)
during which it has memory allocated for it.
- There may be several /instances/ of a single variable in the code,
  each with its own lifetime.
- The lifetime of a variable depends upon
  the memory allocation strategy used for it.
  - Static: lifetime is the whole of runtime.
  - Stack dynamic: lifetime is the portion of runtime between
    when the unit of code containing the variable is entered
    and when control returns to the invoker of the unit of code.
    - In the case of functions/procedures, local variables end
      their lifetime when the function/procedure returns.
    - Note that a variable may go out of scope, but still be alive.
  - Implicit heap-dynamic: lifetime begins when assigned.
    The end of lifetime depends upon garbage collection.
  - Explicit heap-dynamic: lifetime begins when the allocation command
    is used. The end of lifetime may be
    - when the programmer issues a deallocation command, or
    - determined by garbage collection.

*** Garbage collection

#+begin_definition
/Garbage/: allocated memory which is no longer accessible.
#+end_definition
#+begin_definition
/Garbage collection/: deallocating garbage.
#+end_definition

- A complex activity.
  - We will discuss the basics here, and then likely not touch on it again.
- Has a large influence on the efficiency of compiled/interpreted code.
- Two main categories of garbage collection algorithms.
  - Tracing
    - Which includes the “mark & sweep” algorithm.
  - Reference counting
- Additionally, language implementations may include
  compile time “escape analysis”.
  - Memory may be allocated on the stack instead of the heap
    /if/ no references to the variable are available outside its scope
    (if it doesn't “escape”).

#+REVEAL: split:t

- Tracing
  - Usually triggered when a certain memory threshold is reached.
  - Starting from some set of base references (usually those in the stack
    and data segment), “trace” references,
    somehow marking those that are reachable.
    - Starting from the set of base references,
      “determine the transitive closure of reachability”.
  - Once done, free memory that is not marked.
  - “Naive” mark & sweep:
    - Each object in memory has a bitflag for this “marking”,
      kept clear except during garbage collection.
    - When activated, traverse all reachable memory, “marking” what's
      reached, then “sweep” away all unmarked memory
      (and reset the flags of marked memory).
    - Have to pause the program to do this.
      - Better methods do not require pausing.

#+REVEAL: split:t

- Reference counting
  - An “eager” approach.
  - As the program runs, keep a count of the number of references
    to every object in memory.
    - Must be constantly monitoring!
- When the number of references reaches zero, can free the memory for that object. - Have to account for cycles! - Which increases the overhead. * Exercises - [[Exercise: On paradigms][On paradigms]] - [[Exercise: Researching implementations][Researching implementations]] - [[Exercise: On dynamic scoping][On dynamic scoping]] - [[Exercise: Entities which introduce scope][Entities which introduce scope]] - [[Exercise: Memory allocation in C++][Memory allocation in C++]]
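* Appendix: a starting point for the memory allocation exercise

This is not a complete answer to the exercise, only a sketch of how three
of the four kinds of allocation discussed above typically appear in C++;
whether C++ offers true /implicit/ heap-dynamic allocation is left for you
to argue (the ~std::string~ line below is only an approximation, since its
character data /may/ be heap allocated behind the scenes). The names
(~global_count~, ~demo~) are invented for the example.
#+begin_src C++
#include <string>
#include <iostream>

int global_count = 0;        // static: lives in the data segment,
                             // allocated at load time,
                             // lifetime = the whole run

void demo()
{
  int local = 5;             // stack dynamic: allocated when demo() is
                             // entered, gone when demo() returns

  std::string s = "hello";   // the string object is stack dynamic, but its
                             // character data may be heap allocated for it
                             // automatically ("implicit"-flavoured)

  int *p = new int(42);      // explicit heap dynamic: a nameless cell on
                             // the heap, reached through the reference p

  std::cout << global_count << " " << local << " " << s << " " << *p << "\n";

  delete p;                  // manual deallocation; forgetting it leaves
                             // garbage, since C++ does not collect it for you
}

int main() { demo(); }
#+end_src
Note how the lifetimes line up with the /Lifetime/ section:
~global_count~ outlives ~demo~, ~local~ and ~s~ die when ~demo~ returns,
and the cell ~p~ points to dies only when ~delete~ is issued.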