BSC
Please note that due to the Sony NDA, certain in-depth information about shaders on the PS5 has been omitted.

BSC is a shader compiler that transforms HLSL compute shaders into compute shaders that can run on the PlayStation 5. It automatically converts HLSL's built-in struct names, function names and semantics to their PS5 equivalents, with support for the DirectX Raytracing (DXR) API.

I'm making this tool for On the bubble, a cross-platform raytraced game for Windows and PS5. The tool is in active development: I keep adding support for new functions and types, and fixing bugs.

This tool decreases the workload of the graphics programmers on that project. It allows them to write a compute shader once in HLSL without having to worry about rewriting it for the PlayStation.

The underlying workings of the compiler are very simple and follow fairly traditional compiler design.
I wrote my own lexer, parser and code generator.

The lexer (short for lexical analyzer) converts the input HLSL source into tokens. These tokens simplify the parsing process.

The parser converts the tokens into an AST (abstract syntax tree). An AST is a tree data structure that represents the syntax of the language. I wrote a recursive descent parser for BSC.

After the lexing and parsing stages I traverse the syntax trees to detect usage of any HLSL types, functions, etc. If we detect anything HLSL-specific, we 'tag' the syntax tree so that the code generator can generate the correct PlayStation 5 shader names.

The code generator traverses the syntax tree to generate an output string containing the shader that can run on the PlayStation 5. Any syntax tree node that has been 'tagged' will be generated using the PS5 type and function names.

Using this compiler, the graphics programmers working on On the bubble have implemented a full path tracer in HLSL that runs on the PlayStation 5. It supports wavefront raytracing and modern denoising techniques such as ReSTIR and SVGF.

Code samples below
To represent a token I use a simple struct. A token is a bundle of characters representing something in your code. For example, 'return' is a token of type RETURN, and '1024' is a token of type INT_VALUE with the attached data of 1024. Tokens also store which line they are on for error reporting.
							
struct Token {
  // The type of a token, eg: Int, Literal, Semicolon, etc
  TokenType m_Type = TOKEN_TYPE_UNDEFINED; 

  // On what line is this token, used for error reporting.
  int m_Line = 0;

  // We use this union to store extra data if the token type is an
  // int value, decimal value, identifier or string literal.
  union {
    double m_DecimalValue = 0.0;
    int m_IntValue;
    const char *m_StringValue;
  };
};	

In the lexer phase we simply scan over an entire HLSL source file and produce a stream of tokens for the parser.
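To give an idea of what that scanning loop looks like, here is a minimal sketch. It is not BSC's actual lexer: the `SketchToken` struct, the `SKETCH_*` token types and the `lex` function are names I made up for this example, and it only recognises identifiers, integers, the 'return' keyword and semicolons.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical token types, for this sketch only.
enum SketchTokenType {
  SKETCH_RETURN,
  SKETCH_IDENTIFIER,
  SKETCH_INT_VALUE,
  SKETCH_SEMICOLON,
  SKETCH_UNKNOWN
};

struct SketchToken {
  SketchTokenType type = SKETCH_UNKNOWN;
  int line = 1;       // Line number, kept for error reporting.
  int int_value = 0;  // Only meaningful for SKETCH_INT_VALUE.
  std::string text;   // Only meaningful for SKETCH_IDENTIFIER and SKETCH_RETURN.
};

static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
static bool is_ident(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; }

// Scan the whole source once and produce a flat list of tokens.
std::vector<SketchToken> lex(const std::string &source) {
  std::vector<SketchToken> tokens;
  int line = 1;
  std::size_t i = 0;

  while (i < source.size()) {
    char c = source[i];

    // Track line numbers and skip whitespace.
    if (c == '\n') { line++; i++; continue; }
    if (std::isspace(static_cast<unsigned char>(c))) { i++; continue; }

    SketchToken token;
    token.line = line;

    if (is_digit(c)) {
      // Integer literal: consume all consecutive digits.
      std::size_t start = i;
      while (i < source.size() && is_digit(source[i])) i++;
      token.type = SKETCH_INT_VALUE;
      token.int_value = std::atoi(source.substr(start, i - start).c_str());
    } else if (is_ident(c)) {
      // Identifier or keyword: consume letters, digits and underscores.
      std::size_t start = i;
      while (i < source.size() && is_ident(source[i])) i++;
      token.text = source.substr(start, i - start);
      token.type = (token.text == "return") ? SKETCH_RETURN : SKETCH_IDENTIFIER;
    } else {
      // Single-character tokens; anything unrecognised is marked unknown.
      token.type = (c == ';') ? SKETCH_SEMICOLON : SKETCH_UNKNOWN;
      i++;
    }
    tokens.push_back(token);
  }
  return tokens;
}
```

Calling lex("return 1024;") with this sketch yields three tokens: a RETURN, an INT_VALUE carrying 1024, and a SEMICOLON, all on line 1.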
							
							
The parser is a recursive-descent parser.
This section explains how the parsing works. I explain how return statements are parsed because they're simple enough to fit on a website.
							
// Allocate a new AST node using a custom linear allocator.
Ast_return *result = NEW_AST(Ast_return);

// Look at the next token without consuming it. 
// The parser has access to a stream of tokens produced by the lexer.
Token *p = peek(parser, 1);
	 
// If the peeked token is not a semicolon (e.g. a number)
// we have an expression as follows: 'return 10;'
if (p->m_Type != TOKEN_TYPE_SEMICOLON) {

	// Consume the return keyword token.
	eat(parser);

	// Parse the expression after the return keyword.
	// In this example the parse_expr function will return an
	// Expr_value which represents the '10'.
	result->return_expr = parse_expr(parser);

	// Expect a semicolon after the expression.
	// Reports an error if this isn't the case.
	expect(parser, TOKEN_TYPE_SEMICOLON);
}
else {

	// If we end up in this branch we have a return statement
	// without an expression, like this: 'return;'

	// Consume the return keyword token.
	eat(parser);

	// Eat the semicolon token.
	eat(parser);
}

return result;
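The snippet above relies on three small helpers: peek, eat and expect. Below is a minimal sketch of how such helpers could look. This is an assumption for illustration, not BSC's real code: the `MiniParser` and `MiniToken` types and the error handling are invented here, and I assume an offset of 0 means the current token, so an offset of 1 is the token right after it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token types, for this sketch only.
enum MiniTokenType { MINI_RETURN, MINI_INT_VALUE, MINI_SEMICOLON, MINI_END };

struct MiniToken {
  MiniTokenType type;
  int line;
};

struct MiniParser {
  std::vector<MiniToken> tokens;    // Must end with a MINI_END token.
  std::size_t position = 0;         // Index of the current token.
  std::vector<std::string> errors;  // Collected error messages.
};

// Look ahead `offset` tokens without consuming anything.
// Offset 0 is the current token; reads past the end clamp to the last token.
const MiniToken *peek(const MiniParser &parser, std::size_t offset) {
  std::size_t index = parser.position + offset;
  if (index >= parser.tokens.size()) index = parser.tokens.size() - 1;
  return &parser.tokens[index];
}

// Consume the current token and return it. Never advances past the end token.
const MiniToken *eat(MiniParser &parser) {
  const MiniToken *current = peek(parser, 0);
  if (parser.position + 1 < parser.tokens.size()) parser.position++;
  return current;
}

// Consume the current token if it has the expected type,
// otherwise record an error and leave the token in place.
bool expect(MiniParser &parser, MiniTokenType type) {
  const MiniToken *current = peek(parser, 0);
  if (current->type != type) {
    parser.errors.push_back("Unexpected token on line " + std::to_string(current->line));
    return false;
  }
  eat(parser);
  return true;
}
```

Keeping the position inside the parser struct means every parse function can share the same token stream, which is what makes the recursive style work.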
		

							
To translate HLSL variable declarations to PS5 shader variable declarations I use a simple hash table lookup. We use the struct type name to look up the corresponding name for PS5.
Consider the following code as an example:
'RayDesc r;' We use the RayDesc type name to lookup the correct name for PS5.
							

// The following snippet is from the code that emits variable declarations to a PS5 shader. (example: 'int x = 10;')

// If the variable type is an identifier, aka a struct name.
if (varDecl->variable_type == TOKEN_TYPE_IDENTIFIER) {

	// We check if we can find the struct type name in a hash table.
	Struct_description *struct_description = find_struct_description(varDecl->struct_type_name);

	if (struct_description) {

		// If we can find it we set typeName to the playstation name
		typeName = struct_description->playstation_name;
	}
	else {

		// If we can't find it we just use the name produced by the parser.
		// This is a user-defined struct.
		typeName = varDecl->struct_type_name;
	}
}
else {
	// If the variable type is not a struct, it is a builtin type such as an int, float etc.
	// We just convert that type to a string.
	typeName = variable_type_to_string(varDecl->variable_type);
}							
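A lookup like find_struct_description can be sketched with a standard hash map. The code below is an illustration under assumptions, not the real table: `StructDescriptionSketch`, `struct_table` and `resolve_type_name` are made-up names, and "PlatformRay" is an invented placeholder, not an actual PS5 type name.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical description of one built-in HLSL struct.
struct StructDescriptionSketch {
  std::string playstation_name;
};

// Placeholder table keyed by HLSL struct name.
// "PlatformRay" is an invented name used only for this example.
const std::unordered_map<std::string, StructDescriptionSketch> &struct_table() {
  static const std::unordered_map<std::string, StructDescriptionSketch> table = {
    {"RayDesc", StructDescriptionSketch{"PlatformRay"}},
  };
  return table;
}

// Returns nullptr for names that are not built-in HLSL structs.
const StructDescriptionSketch *find_struct_description_sketch(const std::string &hlsl_name) {
  auto it = struct_table().find(hlsl_name);
  return it == struct_table().end() ? nullptr : &it->second;
}

// Resolve the type name to emit for a struct-typed variable declaration.
// Built-in HLSL structs are translated; user-defined structs pass through unchanged.
std::string resolve_type_name(const std::string &hlsl_name) {
  const StructDescriptionSketch *description = find_struct_description_sketch(hlsl_name);
  return description ? description->playstation_name : hlsl_name;
}
```

With this table, resolve_type_name("RayDesc") gives back the placeholder name, while a user-defined name such as "MyMaterial" comes back as-is.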
							
Expressions are a bit different to translate from HLSL to PS5. Consider the following HLSL code snippet:
RayDesc r;
float x = r.Origin.x;
On the PlayStation 5 we can run into scenarios where, for example, 'r.Origin.x' needs to be translated into something like 'r.position.x'. Obviously this is just an example, and I cannot share the actual variable names of the PS5 shader language.

Our parser parses a field expression (r.Origin.x) into a tree that looks like this:

      .
     / \
    r   .
       / \
 Origin   x


To detect the HLSL names that need to be translated, I walk the tree top-down and check the left-hand side first. In this case we first encounter the variable 'r'. We keep a record of all variable declarations in this scope, so we know that 'r' is of type RayDesc. That tells us this expression belongs to the RayDesc struct, so we put a tag on the AST so that the converter backend knows it needs to be translated.
							

// Is the lefthand side of the field expression a value? (example: 'r')
if (field->left->expr_type == EXPR_TYPE_VALUE) {
	Expr_value *value = static_cast<Expr_value *>(field->left);

	// See if we can find the string value of the expression in the variable record.
	if (record->variables.find(std::string(value->string_value)) != record->variables.end()) {

		Ast_var_decl *decl = record->variables[std::string(value->string_value)];

		// If the variable type is an identifier, aka of a struct type.
		if (decl->variable_type == TOKEN_TYPE_IDENTIFIER) {

			// We find the struct type in a lookup hash table.
			Struct_description *struct_description = find_struct_description(decl->struct_type_name);

			// If it is an HLSL type that can be translated we put a little tag on it with the HLSL struct type name,
			// so that the converter backend can generate the correct names for this expression.
			if (struct_description) {
				field->builtin_type_reference = decl->struct_type_name;
			}
		}
	}
}
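The variable record used above can be pictured as a stack of per-scope maps. The sketch below is my own simplified assumption of such a record, not the structure BSC uses: `VariableRecordSketch` and its methods are hypothetical, and it stores type names as plain strings instead of pointers to declaration nodes.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record: one map of variable name -> type name per open scope.
struct VariableRecordSketch {
  std::vector<std::unordered_map<std::string, std::string>> scopes;

  // Open a new scope, e.g. when entering a function body or block.
  void push_scope() { scopes.emplace_back(); }

  // Close the innermost scope and forget its variables.
  void pop_scope() {
    if (!scopes.empty()) scopes.pop_back();
  }

  // Register a declaration such as 'RayDesc r;' in the innermost scope.
  void declare(const std::string &name, const std::string &type_name) {
    if (scopes.empty()) push_scope();
    scopes.back()[name] = type_name;
  }

  // Search from the innermost scope outwards; returns nullptr if unknown.
  const std::string *find_type(const std::string &name) const {
    for (auto scope = scopes.rbegin(); scope != scopes.rend(); ++scope) {
      auto it = scope->find(name);
      if (it != scope->end()) return &it->second;
    }
    return nullptr;
  }
};
```

Searching from the innermost scope outwards means a local variable that shadows an outer 'r' is resolved to its own type, so only expressions on an actual RayDesc variable would get tagged.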
							
When generating the PS5 shader, we check whether a builtin_type_reference is set
each time we need to emit a field expression.

Struct_description *struct_description = find_struct_description(field->builtin_type_reference);
if (struct_description) {

	// If we have a type reference, we emit the left side of the expression ('r' in our little example).
	// The right side of the field expression will be emitted by a 'fix up' function which
	// will walk down the field expression further, emitting the correct PS5 names.
	result += emit_expr(field->left) + "." + fix_up_field_expr_names(field->right, struct_description);
}

The fix_up_field_expr_names function is quite large, covering a large number of AST types (such as values, field expressions, function calls, parameters etc).
This is a little example of how normal values are translated (example: 'r.Origin').

if (expr->expr_type == EXPR_TYPE_VALUE) {
		
	Expr_value *v = static_cast<Expr_value *>(expr);

	// We check if we can find the right side string value ('Origin' in our example) inside of the 
	// struct description
	Member_description *member = find_member_desc_from_struct_desc(struct_desc, v->string_value);
	if (member) {

		// Emit the PS5 name to the result.
		result += member->playstation_name;	
	}
	else {
		// If not, we have an expression which contains a name that has not been registered yet.
		result += emit_expr(v);
	}
}
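To tie the pieces together, here is a small self-contained sketch that builds the tree for 'r.Origin.x' and emits it with a translated member name. Everything in it is an assumption for illustration: `ExprSketch`, `MemberNames` and the emit functions are made up, 'position' is the same placeholder used earlier rather than a real PS5 name, and as a simplification only the first member after the variable is looked up.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified expression node: either a value or a field access.
struct ExprSketch {
  bool is_field = false;
  std::string name;                   // Used when the node is a value.
  std::unique_ptr<ExprSketch> left;   // Used when the node is a field expression.
  std::unique_ptr<ExprSketch> right;
};

std::unique_ptr<ExprSketch> make_value(const std::string &name) {
  std::unique_ptr<ExprSketch> expr(new ExprSketch());
  expr->name = name;
  return expr;
}

std::unique_ptr<ExprSketch> make_field(std::unique_ptr<ExprSketch> left,
                                       std::unique_ptr<ExprSketch> right) {
  std::unique_ptr<ExprSketch> expr(new ExprSketch());
  expr->is_field = true;
  expr->left = std::move(left);
  expr->right = std::move(right);
  return expr;
}

// Maps an HLSL member name to its translated name for one struct type.
using MemberNames = std::unordered_map<std::string, std::string>;

// Emit an expression without translating anything.
std::string emit_plain(const ExprSketch &expr) {
  if (!expr.is_field) return expr.name;
  return emit_plain(*expr.left) + "." + emit_plain(*expr.right);
}

// Emit the right side of a tagged field expression.
// Only the first member name is looked up; the rest is emitted unchanged.
std::string emit_fixed_up(const ExprSketch &expr, const MemberNames &members) {
  if (!expr.is_field) {
    auto it = members.find(expr.name);
    return it != members.end() ? it->second : expr.name;
  }
  return emit_fixed_up(*expr.left, members) + "." + emit_plain(*expr.right);
}

// Emit a tagged field expression: the variable stays as-is, the members get fixed up.
std::string emit_tagged_field(const ExprSketch &field, const MemberNames &members) {
  return emit_plain(*field.left) + "." + emit_fixed_up(*field.right, members);
}
```

Given a member table that maps 'Origin' to 'position', emitting the tree for 'r.Origin.x' produces 'r.position.x', while a member that is not in the table is passed through untouched.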