Generate x86 Assembly with Go

avo

avo makes high-performance Go assembly easier to write, review and maintain. The avo package presents a familiar assembly-like interface that simplifies development without sacrificing performance:

  • Use Go control structures for assembly generation; avo programs are Go programs
  • Register allocation: write functions with virtual registers and avo assigns physical registers for you
  • Automatically load arguments and store return values: ensure memory offsets are correct for complex structures
  • Generation of stub files to interface with your Go package

For an introduction to avo, watch the dotGo 2019 talk Better x86 Assembly Generation with Go (slides), or see the GopherCon 2019 talk (slides) for a longer tutorial. To discuss avo and general Go assembly topics, join us in the #assembly channel of Gophers Slack.

Note: APIs are subject to change while avo is still in an experimental phase. You can use it to build real things, but we suggest you pin a version with your package manager of choice.

Quick Start

Install avo with go get:

$ go get -u github.com/mmcloughlin/avo

avo assembly generators are pure Go programs. Here's a function that adds two uint64 values:

// +build ignore

package main

import . "github.com/mmcloughlin/avo/build"

func main() {
	TEXT("Add", NOSPLIT, "func(x, y uint64) uint64")
	Doc("Add adds x and y.")
	x := Load(Param("x"), GP64())
	y := Load(Param("y"), GP64())
	ADDQ(x, y)
	Store(y, ReturnIndex(0))
	RET()
	Generate()
}

Run this code with go run to see the assembly output. To integrate it into the rest of your Go package, we recommend a go:generate line that produces the assembly and the corresponding Go stub file.

//go:generate go run asm.go -out add.s -stubs stub.go

After running go generate the add.s file will contain the Go assembly.

// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

#include "textflag.h"

// func Add(x uint64, y uint64) uint64
TEXT ·Add(SB), NOSPLIT, $0-24
	MOVQ x+0(FP), AX
	MOVQ y+8(FP), CX
	ADDQ AX, CX
	MOVQ CX, ret+16(FP)
	RET

The same command also produces the stub file stub.go, which lets you call the function from your Go code.

// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

package add

// Add adds x and y.
func Add(x uint64, y uint64) uint64

See the examples/add directory for the complete working example.
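
One way to gain confidence in a generated function is to compare it with a plain Go expression on random inputs. The sketch below uses the standard library's testing/quick for that. The helper name checkAdd is an assumption made for illustration, and a pure Go closure stands in for the assembly-backed Add so that the snippet runs on its own.

```go
package main

import (
	"fmt"
	"testing/quick"
)

// checkAdd compares an implementation of Add against the plain Go expression
// x + y on randomly generated inputs. In a real package the argument would be
// the Add function declared in the generated stub.go. (checkAdd is a
// hypothetical helper, not part of avo.)
func checkAdd(add func(x, y uint64) uint64) error {
	property := func(x, y uint64) bool {
		return add(x, y) == x+y
	}
	return quick.Check(property, nil)
}

func main() {
	// Stand-in for the assembly-backed Add so this sketch is self-contained.
	goAdd := func(x, y uint64) uint64 { return x + y }
	if err := checkAdd(goAdd); err != nil {
		fmt.Println("mismatch:", err)
		return
	}
	fmt.Println("ok")
}
```

In a package that contains add.s and stub.go, the same property would normally live in a _test.go file and call Add directly.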

Examples

See examples for the full suite of examples.

Slice Sum

Sum a slice of uint64s:

func main() {
	TEXT("Sum", NOSPLIT, "func(xs []uint64) uint64")
	Doc("Sum returns the sum of the elements in xs.")
	ptr := Load(Param("xs").Base(), GP64())
	n := Load(Param("xs").Len(), GP64())

	Comment("Initialize sum register to zero.")
	s := GP64()
	XORQ(s, s)

	Label("loop")
	Comment("Loop until zero bytes remain.")
	CMPQ(n, Imm(0))
	JE(LabelRef("done"))

	Comment("Load from pointer and add to running sum.")
	ADDQ(Mem{Base: ptr}, s)

	Comment("Advance pointer, decrement byte count.")
	ADDQ(Imm(8), ptr)
	DECQ(n)
	JMP(LabelRef("loop"))

	Label("done")
	Comment("Store sum to return value.")
	Store(s, ReturnIndex(0))
	RET()
	Generate()
}

The result from this code generator is:

// Code generated by command: go run asm.go -out sum.s -stubs stub.go. DO NOT EDIT.

#include "textflag.h"

// func Sum(xs []uint64) uint64
TEXT ·Sum(SB), NOSPLIT, $0-32
	MOVQ xs_base+0(FP), AX
	MOVQ xs_len+8(FP), CX

	// Initialize sum register to zero.
	XORQ DX, DX

loop:
	// Loop until zero bytes remain.
	CMPQ CX, $0x00
	JE   done

	// Load from pointer and add to running sum.
	ADDQ (AX), DX

	// Advance pointer, decrement byte count.
	ADDQ $0x08, AX
	DECQ CX
	JMP  loop

done:
	// Store sum to return value.
	MOVQ DX, ret+24(FP)
	RET

Full example at examples/sum.
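
For comparison, the loop above can be mirrored in plain Go. This is only a reference sketch for checking the generated code against; the name sumRef is an assumption, and an index stands in for the advancing pointer.

```go
package main

import "fmt"

// sumRef mirrors the structure of the generated loop: an index standing in
// for the advancing pointer, a counter of remaining elements, and a running
// sum that wraps on overflow just as ADDQ does.
func sumRef(xs []uint64) uint64 {
	var s uint64
	i := 0
	n := len(xs)
	for n != 0 {
		s += xs[i] // load the current element and add it to the sum
		i++        // advance to the next element
		n--        // one fewer element remains
	}
	return s
}

func main() {
	fmt.Println(sumRef([]uint64{1, 2, 3})) // prints 6
}
```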

Features

For demonstrations of avo features:

  • args: Loading function arguments.
  • returns: Building return values.
  • complex: Working with complex{64,128} types.
  • data: Defining DATA sections.
  • ext: Interacting with types from external packages.
  • pragma: Applying compiler directives to generated functions.

Real Examples

Implementations of full algorithms:

Contributing

Contributions to avo are welcome:

Credits

Inspired by the PeachPy and asmjit projects. Thanks to Damian Gryski for advice, and his extensive library of PeachPy Go projects.

License

avo is available under the BSD 3-Clause License.

Comments
  • all: support AVX-512

    For complexity reasons AVX-512 was not initially considered. We should add support.

    https://github.com/mmcloughlin/avo/blob/9fbb71b6db924e5524188d05eeef06b619cf6f79/internal/load/load.go#L130-L133

  • question: pointer to struct in a different package

    Migrating question from reddit.

    Hi, again thanks for your time :) I've run into a slight problem which might just be me not understanding the docs. Is there a way of using Param("") and .Field() through a pointer? I've got:

    Package("github.com/jamiec7919/vermeer/core")
    TEXT("intersectBoxes", NOSPLIT, "func (ray *Ray, boxes *[4*3*2]float32, hits *[4]int32, t *[4]float32)")
    Doc("intersectBoxes evaluates the given..")
    // MOVSS (AX),X3 // X3 = ray.O[0]
    rayOX := Load(Param("ray").Field("O").Index(0), XMM())
    

    The Ray type (in core) is something like:

    type Ray struct {
    O [3]float32
    }
    

    But I can't work out how to get a reference to O[0] etc. via the Param. Keep getting 'not struct type' etc. I tried various combos like Base() and Index(0) just in case but no good :)

    Also it didn't seem to like me including the package name in the func signature:

    TEXT("intersectBoxes", NOSPLIT, "func (ray *core.Ray, boxes *[4*3*2]float32, hits *[4]int32, t *[4]float32)")
    

    The asm code is not in the core package, but another package that references a public type from core.

    Errors out with:

    intersect_asm.go:12: eval:1:12: undeclared name: core
    intersect_asm.go:16: unknown variable "ray"
    intersect_asm.go:20: unknown variable "ray"
    exit status 1
    
  • instructions: "ANDQ imm64 r64" and "MOVD r32 xmm" unsupported

    The go assembler supports both of the following instructions but avo does not.

    • ANDQ imm64, r64
    • MOVD r32, xmm

    I see that x86/zctors.go is autogenerated, but I'm a bit lost on how to update the generator to resolve this. To make sure that valid output would be generated I modified it by hand and the generated code was properly assembled:

    diff --git a/x86/zctors.go b/x86/zctors.go
    index 6d03480..f52ba55 100644
    --- a/x86/zctors.go
    +++ b/x86/zctors.go
    @@ -1280,6 +1280,7 @@ func ANDPS(mx, x operand.Op) (*intrep.Instruction, error) {
     // 	ANDQ imm32 rax
     // 	ANDQ imm8  r64
     // 	ANDQ imm32 r64
    +// 	ANDQ imm64 r64
     // 	ANDQ r64   r64
     // 	ANDQ m64   r64
     // 	ANDQ imm8  m64
    @@ -1308,6 +1309,13 @@ func ANDQ(imr, mr operand.Op) (*intrep.Instruction, error) {
     			Inputs:   []operand.Op{mr},
     			Outputs:  []operand.Op{mr},
     		}, nil
    +	case operand.IsIMM64(imr) && operand.IsR64(mr):
    +		return &intrep.Instruction{
    +			Opcode:   "ANDQ",
    +			Operands: []operand.Op{imr, mr},
    +			Inputs:   []operand.Op{mr},
    +			Outputs:  []operand.Op{mr},
    +		}, nil
     	case operand.IsR64(imr) && operand.IsR64(mr):
     		return &intrep.Instruction{
     			Opcode:   "ANDQ",
    @@ -8450,6 +8458,7 @@ func MOVBWZX(mr, r operand.Op) (*intrep.Instruction, error) {
     // 	MOVD imm32 m64
     // 	MOVD r64   m64
     // 	MOVD xmm   r64
    +// 	MOVD r32   xmm
     // 	MOVD r64   xmm
     // 	MOVD xmm   xmm
     // 	MOVD m64   xmm
    @@ -8505,6 +8514,13 @@ func MOVD(imrx, mrx operand.Op) (*intrep.Instruction, error) {
     			Inputs:   []operand.Op{imrx},
     			Outputs:  []operand.Op{mrx},
     		}, nil
    +	case operand.IsR32(imrx) && operand.IsXMM(mrx):
    +		return &intrep.Instruction{
    +			Opcode:   "MOVD",
    +			Operands: []operand.Op{imrx, mrx},
    +			Inputs:   []operand.Op{imrx},
    +			Outputs:  []operand.Op{mrx},
    +		}, nil
     	case operand.IsR64(imrx) && operand.IsXMM(mrx):
     		return &intrep.Instruction{
     			Opcode:   "MOVD",
    
  • reg,pass: refactor allocation of aliased registers

    Issue #100 demonstrated that register allocation for aliased registers is fundamentally broken. The root of the issue is that currently accesses to the same virtual register with different masks are treated as different registers. This PR takes a different approach:

    • Liveness analysis is masked: we now properly consider which parts of a register are live
    • Register allocation produces a mapping from virtual to physical ID, and aliasing is applied later
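
    The masked-liveness idea can be illustrated with a toy model. This is a sketch under assumptions, not avo's implementation: the Mask and LiveSet types and the byte-granular masks are hypothetical. The point is that a live set keyed by register ID, with masks merged by OR, cannot hold the same register twice under different widths.

```go
package main

import "fmt"

// Mask marks which bytes of a 64-bit register are in use: bit i covers byte i.
// These names are hypothetical and do not mirror avo's internal types.
type Mask uint8

const (
	Byte  Mask = 0x01 // low 8 bits
	Word  Mask = 0x03 // low 16 bits
	Dword Mask = 0x0f // low 32 bits
	Qword Mask = 0xff // all 64 bits
)

// LiveSet maps a virtual register ID to the union of its live bytes.
type LiveSet map[int]Mask

// Add marks the bytes in m of register id as live.
func (s LiveSet) Add(id int, m Mask) { s[id] |= m }

// Live reports whether any byte in m of register id is live.
func (s LiveSet) Live(id int, m Mask) bool { return s[id]&m != 0 }

func main() {
	live := LiveSet{}
	live.Add(29, Byte)  // 8-bit access to virtual register 29
	live.Add(29, Dword) // 32-bit access to the same register
	// One entry for register 29, and its low word counts as live.
	fmt.Println(len(live), live.Live(29, Word)) // prints 1 true
}
```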
  • operand: unable to generate "ANDQ $-32, DI"

    I am porting the runtime memmove. I have come across ANDQ $-32, DI, which I suspect gets translated to 0xffffffffffffffe0.

    I have not found a way to enter that instruction. All fail with "ANDQ: bad operands"

    I have tried ANDQ(I8(-32), dst), ANDQ(I64(-32), dst), ANDQ(U64(0xffffffffffffffe0), dst).
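
    The relationship between -32 and 0xffffffffffffffe0 can be checked in plain Go. In the x86-64 encoding, the imm8 and imm32 forms of AND are sign-extended to 64 bits, so the question is whether the 64-bit pattern is the sign extension of a narrower value. The helper fitsSigned below is a hypothetical illustration and says nothing about how avo classifies operands.

```go
package main

import "fmt"

// fitsSigned reports whether the 64-bit pattern u equals the sign extension
// of some signed integer that is `bits` wide. (Hypothetical helper, not an
// avo API.)
func fitsSigned(u uint64, bits uint) bool {
	s := int64(u)
	lo := -(int64(1) << (bits - 1))
	hi := int64(1)<<(bits-1) - 1
	return s >= lo && s <= hi
}

func main() {
	v := int64(-32)
	u := uint64(v) // two's complement reinterpretation
	fmt.Printf("%#x\n", u) // prints 0xffffffffffffffe0

	// The pattern is the sign extension of both an 8-bit and a 32-bit value.
	fmt.Println(fitsSigned(u, 8), fitsSigned(u, 32)) // prints true true

	// A zero-extended 32-bit pattern is a different 64-bit value.
	fmt.Println(fitsSigned(0xffffffe0, 32)) // prints false
}
```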

  • pass: unexpected register allocation failure

    @klauspost reported an unexpected register allocation failure in the following Slack thread:

    https://gophers.slack.com/archives/C6WDZJ70S/p1576430114006900?thread_ts=1576361104.001800&cid=C6WDZJ70S

    The following code illustrates the problem (look for the FIXME comments)

    https://github.com/klauspost/compress/pull/186/files#diff-60d6e6209d2fc7873851f07b946a5dc7R230

    After an initial analysis using the experimental debug printer #6 it appears we are getting duplicate entries in the live register sets, for example <virtual:20:1:8> below.

    ...
    		instruction {
    			addr        = 0x0xc00010ed20
    			opcode      = "CMPB"
    			terminal    = false
    			branch      = false
    			conditional = false
    			operands {
    				0: CL
    				1: DL
    			}
    			inputs {
    				0: <virtual:22:1:1>
    				1: <virtual:23:1:1>
    			}
    			outputs {
    			}
    			pred {
    				0x0xc00010ec80
    			}
    			succ {
    				0x0xc00010edc0
    			}
    			livein {
    				size = 14
    				<virtual:30:1:1>
    				<virtual:20:1:8>
    				<virtual:21:1:8>
    				FP
    				<virtual:20:1:8>
    				<virtual:22:1:1>
    				<virtual:30:1:4>
    				SP
    				<virtual:19:1:8>
    				<virtual:29:1:2>
    				<virtual:23:1:1>
    				<virtual:29:1:4>
    				<virtual:6:1:8>
    				<virtual:29:1:1>
    			}
    			liveout {
    				size = 12
    				<virtual:19:1:8>
    				<virtual:21:1:8>
    				<virtual:20:1:8>
    				<virtual:29:1:1>
    				<virtual:29:1:4>
    				<virtual:29:1:2>
    				<virtual:6:1:8>
    				<virtual:30:1:1>
    				<virtual:30:1:4>
    				<virtual:20:1:8>
    				SP
    				FP
    			}
    		}
    ...
    

    Related #43 #65

  • Support AVX512_BITALG instructions and relax base register req in VM operands

    This PR updates the AVX512 branch only, and brings the following:

    • Relaxes the current requirement that VM operands in VPSCATTER/GATHER have a base register.

      • This is not required by AVX512 or the golang assembler, as discussed here: https://github.com/mmcloughlin/avo/issues/193
    • Adds support for AVX512_BITALG instructions / forms.

      • Adds VPOPCNTB/W byte/word forms of VPOPCNT, including AVX512VL 128/256-bit wide (xmm/ymm) forms
      • Adds AVX512VL 128/256-bit wide (xmm/ymm) forms to the existing supported AVX512_VPOPCNTDQ instructions.
      • Adds new instruction VPSHUFBITQMB with its AVX512VL 128/256-bit wide (xmm/ymm) forms.
      • Support for these instructions/forms was hand edited into internal/data/x86_64.xml as that seemed the path of least resistance, and scripts/generate was run to regenerate build/zinstructions.go et al.

    Additionally it:

    • Merges in all changes from the current master branch
    • Updates CI for go1.17 (from go1.17 branch)
  • reg: same physical register for two different virtual registers

    Hello, I'm currently trying to implement md5 with avo (just to learn), and I found this strange behavior.

    // some code
    
    a, b, c, d := GP32(), GP32(), GP32(), GP32()
    
    // some other code
    
    t := GP32()
    MOVL(a, t)
    

    In this case, a and t will have the same physical register (not expected), so the MOVL instruction will look like this for example: MOVL DI, DI

    But if I duplicate the MOVL instruction, a and t will have two different physical registers.

    // some code
    
    a, b, c, d := GP32(), GP32(), GP32(), GP32()
    
    // some other code
    
    t := GP32()
    MOVL(a, t)
    MOVL(a, t)
    

    The two MOVL instructions will look like this:

    MOVL DI, R12
    MOVL DI, R12

    I didn't manage to reproduce the problem in a simpler example, so here is my code: https://github.com/etiennedaspe/md5-avo/blob/bd5eb593cc060a5f687d7f08604f2cab89fcb10e/asm.go

    The initialization of 'a' is on line 28, and the initialization of 't' and the MOVL instruction are on lines 88-89.

  • instructions: GFNI

    I would like to use GFNI for my reedsolomon package, and the instructions have a variety of other interesting uses.

    It seems like the Go Assembler supports the instructions, but avo doesn't.

    It seems like they are missing from https://github.com/Maratyszcza/Opcodes

    I wouldn't mind sending a PR, but without an upstream change I don't see any "sustainable" way to add them.

  • doc: broken link to Filippo's live coding session

    Hi, it seems the link to Filippo's live coding session on avo, which is present in the README, is dead. I've looked for another link to it but couldn't find one.

  • bug: Function signatures generate incorrect argument size

    avo: TEXT("Example", NOSPLIT|NOFRAME, "func(x *uint64, y uint32)")
    asm: TEXT ·Example(SB), NOSPLIT|NOFRAME, $0-16

    go vet complains:

    ./bug.s:6:1: [amd64] Example: wrong argument size 16; expected $...-12

    This may very well be go vet being silly, but I'm not sure if there's supposed to be padding there or not.
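
    A simplified model of the stack-based argument layout reproduces the 12 that go vet expects, as well as the $0-24 and $0-32 frame annotations shown for Add and Sum earlier in this README. This is a sketch under assumptions, not go vet's or avo's code: the field type and argSize function are hypothetical, pointer size is fixed at 8 bytes for amd64, and any padding after the last result is not modeled.

```go
package main

import "fmt"

// field describes one parameter or result by size and alignment in bytes.
// (Hypothetical type for illustration.)
type field struct {
	size, align int
}

// alignUp rounds off up to the next multiple of align (a power of two).
func alignUp(off, align int) int {
	return (off + align - 1) &^ (align - 1)
}

// argSize places each parameter at its natural alignment, then, only if
// there are results, aligns the start of the results to the pointer size
// and places them the same way. It returns the end offset.
func argSize(params, results []field) int {
	const ptrSize = 8 // amd64 assumption
	off := 0
	for _, p := range params {
		off = alignUp(off, p.align) + p.size
	}
	if len(results) > 0 {
		off = alignUp(off, ptrSize)
		for _, r := range results {
			off = alignUp(off, r.align) + r.size
		}
	}
	return off
}

func main() {
	word := field{size: 8, align: 8}  // uint64, pointer, or slice word
	u32 := field{size: 4, align: 4}   // uint32

	add := argSize([]field{word, word}, []field{word})       // func(x, y uint64) uint64
	sum := argSize([]field{word, word, word}, []field{word}) // func(xs []uint64) uint64
	example := argSize([]field{word, u32}, nil)              // func(x *uint64, y uint32)

	fmt.Println(add, sum, example) // prints 24 32 12
}
```

    Under this model, no padding follows y because there are no results to align.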

  • Add Printer hook enabling custom user defined file output like -stubs

    This PR contains a minimal feature set implementing a general hook enabling user defined solutions for:

    • Cpu feature-checks and runtime dispatch helpers (ie. #168)
    • Test generation
    • Documentation generation
    • Etc...

    In a nutshell this PR does two things:

    1. Defines a new global function AddPrinter, which allows the user to define a new command line flag/description, and define a Builder to handle file generation.
    // AddPrinter registers a custom printer
    func AddPrinter(flag, desc string, pB printer.Builder, dflt io.WriteCloser)
    
    2. In support of the above, lifts the former avo/internal/prnt package to the new externally exported avo/printer package. All former internal references to prnt now use printer instead and internal/prnt is eliminated.

    That's it! I've been using this on my fork for PureGo+CPUID Dispatch and Test generation (driven by text/template) for many months, and it has met all of my needs. Here's what setting that up looks like:

    	pureGoGen := configurePureGoGeneration(variants, bitWidths, queryWidths, precisions, codeGenerated)
    	avo.AddPrinter("purego", "produce file of golang implementations", pureGoGen, nil)
    	testGen := configureTestGeneration(variants, bitWidths, queryWidths, codeGenerated)
    	avo.AddPrinter("tests", "produce file of tests", testGen, nil)
    	avo.Generate()
    

    The "Generation" functions use the newly exported avo/printer sub-package to set up the generation code just the way my project needs, and that's it!

    This is the last PR (in conjunction with #234 and #233) that I need to stop maintaining my own Avo fork. I don't currently require the functionality in #349, but I decided to just do it as a bonus while I had all of the opcodesextra state mentally loaded up.

    Enjoy! And I'm happy to discuss and implement any further suggestions. My goal here was to be minimally disruptive to Avo while providing maximum flexibility to advanced users who need more code generation flexibility.

  • Add support for AVX512 Vpclmulqdq, Vbmi2, Vnni, & Vaes instructions

    This PR, in conjunction with #234, rounds out the "missing" AVX512 extensions which are supported by the Golang assembler, but not present in the X86_64.xml file used to generate Avo's instruction support files.

    With the addition of the instructions in these two PRs (plus those merged in #344), Avo will support all of the AVX512 instructions up through the Intel Sunny Cove (Ice & Rocket Lake) and AMD Zen4 processor lines.

    Instructions were added using the same opcodesextra mechanism as was used in the GFNI PR.

  • Added go-recipes badge

    Hello, Hi!! 👋🏻

    I like your project and I think broader Go community will benefit from it too.

    Thus, I added it to the curated list of Go tools.

    I hope this badge will serve as a mark of quality and appreciation to your project.

    Once again, thank you for your work!!

    ❤️

    -- Nikolay

  • How to generate stub.go files with proper imports?

    The provided example project that shows how to use custom types in avo generated asm functions depends on using a handcrafted stub.go file that is not overwritten by using a -stubs parameter.

    It would be a very handy minor change to be able to specify one or more import statements to insert in generated stubs so that large/complex stub files don't need to be hand maintained as the example requires.

    I'm imagining something like:

    //go:build ignore
    // +build ignore
    
    package main
    
    import . "github.com/mmcloughlin/avo/build"
    
    func main() {
    	Package("github.com/mmcloughlin/avo/examples/ext")
    	Import("github.com/mmcloughlin/avo/examples/ext/ext")  // <-- Specify where type(s) defined
    
    	TEXT("StructFieldB", NOSPLIT, "func(e ext.Struct) byte")
    	Doc("StructFieldB returns field B.")
    	b := Load(Param("e").Field("B"), GP8())
    	Store(b, ReturnIndex(0))
    	RET()
    	Generate()
    }
    

    This could then be run with //go:generate go run asm.go -out ext.s -stubs stub.go to generate a corresponding stub:

    // Code generated by command: go run asm.go -out ext.s -stubs stub.go. DO NOT EDIT.
    
    package ext
    
    import "github.com/mmcloughlin/avo/examples/ext/ext"
    
    // StructFieldB returns field B.
    func StructFieldB(e ext.Struct) byte
    
  • question: disallow automatic use of certain register(s)

    Is it possible to teach avo not to use a given set of registers during register allocation? Think of a custom calling convention where we reserve some registers for internal use (a programmer can still reference these registers using explicit names).
