LLVM Questions

General Questions


What is a compiler?

A compiler is a software tool that translates source code written in a high-level programming language (like C, C++, or Java) into machine code or an intermediate code that can be executed by a computer. It reads the entire program, analyzes it for syntax and semantic errors, and converts it into an executable file or code that the computer’s processor can understand and run.


Explain the stages of a compiler (lexical analysis, parsing, semantic analysis, optimization, code generation, and code emission)

The compilation process is typically divided into several stages, each with a specific function:

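  1. Lexical Analysis: Splits the source text into tokens (keywords, identifiers, literals, operators) and discards whitespace and comments.
  2. Parsing: Checks the token stream against the language grammar and builds a syntax tree, typically an abstract syntax tree (AST).
  3. Semantic Analysis: Checks rules the grammar cannot express, such as types, scopes, and declaration before use.
  4. Optimization: Transforms the intermediate representation so the program runs faster or takes less space without changing its behavior.
  5. Code Generation: Maps the optimized representation to target instructions, including instruction selection and register allocation.
  6. Code Emission: Writes the generated instructions out as assembly or as an object file.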

Each of these stages plays a crucial role in transforming human-readable source code into a program that can be executed by a computer.
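
As an illustration of the first stage, here is a minimal sketch of a lexer for a tiny made-up language. The names Token, TokenKind, and tokenize are hypothetical, and the rule that any other character is a one-character operator is a simplifying assumption; a real lexer also handles comments, string literals, and multi-character operators.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for a tiny arithmetic language.
enum class TokenKind { Number, Identifier, Operator };

struct Token {
  TokenKind kind;
  std::string text;
};

// Splits the input into tokens and skips whitespace.
// Assumption: any character that is not a digit, letter, underscore,
// or whitespace is a single-character operator.
std::vector<Token> tokenize(const std::string &src) {
  std::vector<Token> tokens;
  std::size_t i = 0;
  auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };
  while (i < src.size()) {
    if (std::isspace(at(i))) {
      ++i;  // whitespace separates tokens but produces none
    } else if (std::isdigit(at(i))) {
      std::size_t start = i;
      while (i < src.size() && std::isdigit(at(i))) ++i;
      tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
    } else if (std::isalpha(at(i)) || src[i] == '_') {
      std::size_t start = i;
      while (i < src.size() && (std::isalnum(at(i)) || src[i] == '_')) ++i;
      tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
    } else {
      tokens.push_back({TokenKind::Operator, std::string(1, src[i])});
      ++i;
    }
  }
  return tokens;
}
```

The parser would then consume this token list instead of raw characters, which is why the two stages are usually kept separate.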


Difference between a compiler and an interpreter

Compiler: Translates the entire source code of a program into machine code or an intermediate code in one go and produces an executable file. This means that the program must be fully compiled before it can be run.

Interpreter: Translates and executes the source code line by line or statement by statement. It reads and processes the code sequentially and does not produce an intermediate executable file.
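
The difference can be sketched on a toy language of integer expressions in postfix notation. Everything here is hypothetical: interpret, compile, run, and the three-instruction stack machine are made up for the example, and the instruction list only stands in for real machine code. Input is assumed to be well formed, so there is no error handling.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Toy language: "2 3 + 4 *" means (2 + 3) * 4.

// Interpreter style: read each token and act on it immediately.
long interpret(const std::string &src) {
  std::vector<long> stack;
  std::istringstream in(src);
  std::string tok;
  while (in >> tok) {
    if (tok == "+" || tok == "*") {
      long rhs = stack.back(); stack.pop_back();
      long lhs = stack.back(); stack.pop_back();
      stack.push_back(tok == "+" ? lhs + rhs : lhs * rhs);
    } else {
      stack.push_back(std::stol(tok));
    }
  }
  return stack.back();
}

// Compiler style: translate the whole source into instructions first...
enum class Op { Push, Add, Mul };
struct Instr {
  Op op;
  long value;  // only meaningful for Push
};

std::vector<Instr> compile(const std::string &src) {
  std::vector<Instr> program;
  std::istringstream in(src);
  std::string tok;
  while (in >> tok) {
    if (tok == "+") program.push_back({Op::Add, 0});
    else if (tok == "*") program.push_back({Op::Mul, 0});
    else program.push_back({Op::Push, std::stol(tok)});
  }
  return program;
}

// ...and execute the translated program later, without reading the source again.
long run(const std::vector<Instr> &program) {
  std::vector<long> stack;
  for (const Instr &ins : program) {
    if (ins.op == Op::Push) {
      stack.push_back(ins.value);
      continue;
    }
    long rhs = stack.back(); stack.pop_back();
    long lhs = stack.back(); stack.pop_back();
    stack.push_back(ins.op == Op::Add ? lhs + rhs : lhs * rhs);
  }
  return stack.back();
}
```

With compile, the cost of reading and translating the source is paid once and the result can be run many times; with interpret, it is paid on every execution.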

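Advantages of an interpreter:

  1. Ease of Debugging: Errors are reported when the failing statement executes, so they are easy to trace back to the source line.
  2. Platform Independence: The same source runs wherever the interpreter is available, with no per-target build.
  3. Portability: Programs are distributed as source, so there are no per-platform binaries to ship.
  4. Faster Development Cycle: There is no separate compile and link step between editing and running.
  5. Interactive Execution: Code can be run one statement at a time, for example in a REPL.
  6. Dynamic Typing and Flexibility: Many interpreted languages resolve types at run time, which makes features such as runtime code evaluation easier to support.
  7. Scripting and Automation: Well suited to short scripts and glue code, where convenience matters more than peak speed.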

JIT vs AOT


LLVM Specific

What is LLVM?

LLVM is a compiler infrastructure made up of a large number of libraries and tools. The idea is for it to be reusable: each part is a library, and much of the code can be reused. It is used for building compilers, code analysis tools, and so on.


LLVM Core components

  1. LLVM Core Libraries: Provide the foundational tools for building and manipulating the LLVM intermediate representation (IR).
  2. LLVM IR (Intermediate Representation): A low-level, largely target-independent representation of code in static single assignment (SSA) form, used for optimization and code generation.
  3. LLVM Compiler Frontend: Converts source code from high-level languages to LLVM IR (e.g., Clang for C/C++).
  4. LLVM Optimizer: Performs various optimizations on LLVM IR to improve performance and efficiency.
  5. LLVM Code Generator: Translates optimized LLVM IR into machine code for different target architectures.
  6. LLVM Backends: Implement platform-specific code generation for various hardware architectures (e.g., x86, ARM).
  7. LLVM Runtime Library (e.g., compiler-rt): Provides runtime support functions for programs compiled with LLVM.
  8. Clang: A popular frontend that compiles C, C++, and Objective-C code to LLVM IR.
  9. LLD: The LLVM linker that combines object files into executables.
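
To make the IR and optimizer roles concrete, here is a minimal sketch of a constant-folding pass over a toy IR. This is not LLVM IR and does not use the LLVM API: Operand, Instr, constantOf, valueRef, and foldConstants are hypothetical names, and the code assumes every name is defined exactly once before it is used (SSA-like).

```cpp
#include <initializer_list>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy IR: each instruction computes dest = lhs op rhs.
struct Operand {
  bool isConst;
  long value;        // used when isConst is true
  std::string name;  // used when isConst is false
};

enum class Opcode { Add, Mul };

struct Instr {
  std::string dest;
  Opcode op;
  Operand lhs, rhs;
};

Operand constantOf(long v) { return {true, v, ""}; }
Operand valueRef(const std::string &n) { return {false, 0, n}; }

// Toy "pass": folds instructions whose operands are all known constants and
// substitutes the results into later uses. Returns the instructions that
// remain; `known` receives the value of every folded name.
std::vector<Instr> foldConstants(const std::vector<Instr> &in,
                                 std::map<std::string, long> &known) {
  std::vector<Instr> out;
  for (Instr ins : in) {
    // Replace references to already-folded names with their constant value.
    for (Operand *o : {&ins.lhs, &ins.rhs}) {
      if (o->isConst) continue;
      auto it = known.find(o->name);
      if (it != known.end()) *o = constantOf(it->second);
    }
    if (ins.lhs.isConst && ins.rhs.isConst) {
      known[ins.dest] = ins.op == Opcode::Add ? ins.lhs.value + ins.rhs.value
                                              : ins.lhs.value * ins.rhs.value;
    } else {
      out.push_back(ins);  // depends on a value only known at run time
    }
  }
  return out;
}
```

Because the pass only sees the toy IR, it would work unchanged for any frontend that produces it and any backend that consumes it, which is the point of putting the optimizer in the middle.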

What are the advantages of using LLVM over traditional compilers?

Advantages of using LLVM over traditional compilers:

  1. Modularity: LLVM’s modular architecture allows developers to easily add new optimizations, backends, and frontends, making it highly extensible.
  2. Intermediate Representation (IR): The use of LLVM IR simplifies cross-platform compilation and enables powerful optimization techniques that are easier to implement than in traditional compilers.
  3. Reusable Components: LLVM provides reusable libraries for parsing, optimizing, and generating code, reducing the need to write custom compiler code from scratch.
  4. Multi-Language Support: LLVM supports multiple programming languages through separate frontends (e.g., Clang for C and C++, rustc for Rust, swiftc for Swift), all of which share the same optimizer and backends.
  5. Target Independence: LLVM IR can be compiled to various machine architectures using different backends, making it easy to generate code for multiple platforms.
  6. Advanced Optimizations: LLVM has sophisticated optimization passes that can improve both performance and code size. The same passes can be used for ahead-of-time compilation or inside a Just-In-Time (JIT) compiler.
  7. Tooling: LLVM includes tools for code analysis, static analysis, and debugging, which are useful for development beyond just compiling code.
  8. Community and Ecosystem: LLVM has a large and active community that contributes to its continuous improvement and support, making it easier to find help and resources.

What are the different types of LLVM IR?


How does LLVM perform loop optimization?

  1. Loop Unrolling: Duplicates the loop body several times per iteration to reduce loop control overhead and increase instruction-level parallelism.
  2. Loop Fusion: Merges adjacent loops that iterate over the same range to reduce the overhead of loop control and improve cache performance.
  3. Loop Invariant Code Motion: Moves computations whose result does not change between iterations (e.g., calculations on constants) outside the loop to avoid redundant work.
  4. Loop Vectorization: Converts scalar operations in the loop to vector operations, allowing parallel processing using SIMD (Single Instruction, Multiple Data) instructions for better CPU utilization.
  5. Loop Peeling: Pulls the first (or last) few iterations out of the loop so that edge cases are handled separately and the remaining loop is simpler to optimize.
  6. Loop Blocking/Tiling: Breaks loops into smaller chunks (blocks or tiles) to improve cache locality by ensuring that data accessed by a loop fits into the cache more effectively.
  7. Dead Code Elimination: Removes computations within loops that do not affect the final result, reducing the workload and improving execution speed.
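
Two of the transformations above, loop invariant code motion and loop unrolling, can be sketched by applying them by hand to ordinary C++. The function names are hypothetical and the unroll factor of 4 is an arbitrary choice for the example; LLVM performs these rewrites on IR and picks factors with its own cost model.

```cpp
#include <cstddef>
#include <vector>

// Baseline: scale * offset is recomputed on every iteration even though
// neither value changes inside the loop.
long sumNaive(const std::vector<long> &data, long scale, long offset) {
  long sum = 0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    long k = scale * offset;  // loop-invariant computation
    sum += data[i] + k;
  }
  return sum;
}

// Loop invariant code motion applied by hand: k is computed once, before the loop.
long sumHoisted(const std::vector<long> &data, long scale, long offset) {
  const long k = scale * offset;
  long sum = 0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    sum += data[i] + k;
  }
  return sum;
}

// Unrolling by 4 applied by hand: one loop test per four elements,
// plus a remainder loop for the elements that do not fill a group of four.
long sumUnrolled(const std::vector<long> &data, long scale, long offset) {
  const long k = scale * offset;
  const std::size_t n = data.size();
  long sum = 0;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    sum += data[i] + k;
    sum += data[i + 1] + k;
    sum += data[i + 2] + k;
    sum += data[i + 3] + k;
  }
  for (; i < n; ++i) {
    sum += data[i] + k;
  }
  return sum;
}
```

All three functions compute the same result; only the amount of work per element and the number of loop tests differ.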

Which optimizations exist in LLVM?


How does LLVM optimize loops?


What is LLVM's optimization pipeline?

TODO:


Write a simple Pass for LLVM

// Note: this example uses the legacy pass manager API (FunctionPass, RegisterPass).
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
  struct SimpleFunctionPass : public FunctionPass {
    // Pass ID (used to identify the pass)
    static char ID;
    SimpleFunctionPass() : FunctionPass(ID) {}

    // The main function that performs the pass
    bool runOnFunction(Function &F) override {
      // Print the function name to the standard error
      errs() << "Function: " << F.getName() << "\n";
      // We do not modify the function, so return false
      return false;
    }
  };
}

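// Define the static pass ID. LLVM identifies the pass by the address of ID,
// so the initial value does not matter.
char SimpleFunctionPass::ID = 0;

// Register the pass with LLVM. Arguments: command-line name, description,
// CFGOnly (the pass only looks at the CFG), is_analysis (the pass is an analysis).
static RegisterPass<SimpleFunctionPass> X("<name>", "Docs", false, false);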

Static and dynamic analysis in compilers

Key Differences:


LLVM Passes

How do you test a new pass?

Unit test it (llvm-lit), benchmark it (llvm-exegesis), and run it with the opt tool.

LLVM codegen