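multi_compile and shader_feature in Unity Shaders (repost)

I. multi_compile or shader_feature

shader_feature and multi_compile are two very similar preprocessor directives. In Editor mode there is almost no difference between them.

What they have in common:

  • Both declare keywords, which are used to generate shader variants: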
#pragma multi_compile A B
//OR #pragma shader_feature A B

//-----------------------Block A----------------------
#if A
  return fixed4(1,1,1,1);
#endif
//---------------------------------------------------

//-----------------------Block B----------------------
#if B
  return fixed4(0,0,0,1);
#endif
//---------------------------------------------------
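  • The shader is compiled into two variants: variant A, which contains only the Block A code, and variant B, which contains only the Block B code.
  • The first keyword listed is active by default, so variant A is used by default.
  • At run time, scripts choose between variant A and variant B with Material.EnableKeyword or Shader.EnableKeyword.
  • The keywords they declare are global, so they can affect every shader that declares the same keyword.
  • At most 256 such keywords can be declared globally.
  • Keep in mind how the number of keywords relates to the number of variants, and the performance cost that can follow. Declaring two keyword groups, #pragma multi_compile A B and #pragma multi_compile D E, produces 2×2=4 shader variants, but declaring 10 such groups produces 2^10=1024 variants for that shader.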
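The arithmetic in the last bullet can be checked with a short Python sketch. It only models the counting rule (one keyword chosen per group, so the counts multiply). The function name and the placeholder keyword names are illustrative assumptions, not part of Unity.

```python
from math import prod

def variant_count(keyword_groups):
    """Number of variants = product of the sizes of all keyword groups."""
    return prod(len(group) for group in keyword_groups)

# The two groups from the text: "multi_compile A B" and "multi_compile D E".
two_groups = [["A", "B"], ["D", "E"]]

# Ten hypothetical two-keyword groups (placeholder names).
ten_groups = [[f"K{i}_OFF", f"K{i}_ON"] for i in range(10)]

print(variant_count(two_groups))
print(variant_count(ten_groups))
```

Because the counts multiply, each additional two-keyword group doubles the total.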

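The difference:

With shader_feature, variants that are not used at build time are stripped and do not end up in the build. In other words, in a built player, calling Material.EnableKeyword("B") may have no effect: no Material uses variant B, so variant B was never built and cannot be found at run time.

To work around this, use any one of the following:

  1. Use multi_compile instead of shader_feature; multi_compile builds every variant.
  2. Add the shader to "Always Included Shaders" (Project Settings -> Graphics).
  3. Create a Material that uses variant B, which forces variant B to count as used.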
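The stripping behaviour described above can be pictured with a toy Python model. This is a simplification for illustration, not Unity's build pipeline: it assumes shader_feature keeps exactly the keyword combinations that some Material in the build uses, and that multi_compile keeps every combination. The function name and data layout are hypothetical.

```python
from itertools import product

def built_variants(directive, keyword_groups, material_keyword_sets):
    """Return the set of variants (as frozensets of keywords) that survive the build."""
    all_variants = {frozenset(combo) for combo in product(*keyword_groups)}
    if directive == "multi_compile":
        return all_variants  # every combination is built
    # shader_feature (toy rule): keep only combinations a material actually uses
    used = {frozenset(keywords) for keywords in material_keyword_sets}
    return all_variants & used

groups = [["A", "B"]]
materials = [{"A"}]  # no material uses keyword B

with_feature = built_variants("shader_feature", groups, materials)
with_multi = built_variants("multi_compile", groups, materials)

print(frozenset({"B"}) in with_feature)  # is variant B available at run time?
print(frozenset({"B"}) in with_multi)
```

In this model, adding a Material with keyword B to `materials` (workaround 3) makes variant B survive under shader_feature as well.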

II. __

As noted above, at most 256 global keywords can be declared, so we should use as few keywords as possible. One trick is to use __ (two underscores), for example:

#pragma multi_compile __ A
//OR #pragma shader_feature __ A

//-----------------------Block A----------------------
#if A
  return fixed4(1,1,1,1); 
#endif 
//---------------------------------------------------

return fixed4(0,0,0,1);
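  • Compared with #pragma multi_compile A B, this uses one fewer keyword.
  • It still compiles into two variants: a "not A" variant, which omits the Block A code, and variant A, which includes it.
  • The default is __, so the "not A" variant is active by default.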

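III. multi_compile_local

Only 256 global keywords are available, which may eventually become a limitation, and most keywords do not need to be declared globally.

We can therefore use multi_compile_local to declare local keywords that take effect only inside the shader that declares them. Usage is similar: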

#pragma multi_compile_local __ A
//OR #pragma shader_feature_local __ A

//-----------------------Block A----------------------
#if A
  return fixed4(1,1,1,1); 
#endif 
//---------------------------------------------------

return fixed4(0,0,0,1);

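A few things to note:

  • Local keywords are still limited in number: each shader can contain at most 64 local keywords.
  • Because these keywords are local, Material.EnableKeyword still works, but global switches such as Shader.EnableKeyword and CommandBuffer.EnableShaderKeyword have no effect on them.
  • If you declare a global keyword A and also a local keyword with the same name A, keyword A is treated as local.

Original article: https://zhuanlan.zhihu.com/p/77043332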

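About ShaderVariantCollection (repost)

I. Basics

When writing a shader, you often define several macros and use them in the shader code to control how an object is rendered when each macro is on or off. At compile time, the shader source is compiled once for each combination of these macros. Each combination is one variant of the shader.

The Shader Keywords stored on a Material indicate which macros in the shader are enabled. Unity renders the Material with the variant that corresponds to the current macro combination.
In the Editor, switch the Material inspector to Debug mode to see the keywords currently set on the Material. You can also set keywords directly in this mode, separated by spaces.

In code, use Material.EnableKeyword(), Material.DisableKeyword(), Shader.EnableKeyword(), and Shader.DisableKeyword() to enable or disable the corresponding macro. Each Enable call should be paired with the matching Disable call: a macro enabled with Material.EnableKeyword() must be disabled with Material.DisableKeyword(), because Shader.DisableKeyword() cannot disable it. Keywords defined on a Material are set through the Material's functions.

multi_compile and shader_feature
Both multi_compile and shader_feature define macros in a shader. The figure below shows how they differ: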

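  • Declaration

One point to note about declaration: #pragma shader_feature A is shorthand for #pragma shader_feature _ A, where the underscore stands for "no macro defined" (nokeyword). The shader therefore has two variants: one with no keyword, and one with macro A defined.

#pragma multi_compile A has no such shorthand, so the shader then has only the A variant. To include a variant with no macro defined, write #pragma multi_compile __ A.

  • Scope of the macros

Macros defined through multi_compile, such as those from #pragma multi_compile_fog and #pragma multi_compile_fwdbase, apply to most shaders and are independent of the shader's own properties.

Macros defined through shader_feature are mostly tied to the shader's own properties. For example, if the shader has a _NormalMap property, #pragma shader_feature _NormalMap defines a macro that lets the shader take a different path depending on whether the Material has a _NormalMap.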

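  • Variant generation

Suppose a shader declares:

#pragma multi_compile A B C
#pragma multi_compile D E

This generates six variants: A D, A E, B D, B E, C D, and C E.
For shader_feature, which variants are generated can be customized with a shader variant collection.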
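The multi_compile combinations listed above are a Cartesian product of the keyword groups, which a few lines of Python can enumerate. This is only a sketch of the combination rule. Treating names made up entirely of underscores as the blank keyword, and printing an all-blank combination as "nokeyword", are conventions assumed here for readability.

```python
from itertools import product

def enumerate_variants(keyword_groups):
    """List every keyword combination, one keyword taken from each group."""
    variants = []
    for combo in product(*keyword_groups):
        # Names consisting only of underscores ("_" or "__") mean "no keyword".
        enabled = [k for k in combo if k.strip("_")]
        variants.append(" ".join(enabled) or "nokeyword")
    return variants

variants = enumerate_variants([["A", "B", "C"], ["D", "E"]])
print(variants)
print(enumerate_variants([["__", "A"]]))
```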

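  • Macros defined by default

When the keywords on a Material do not match any variant the shader generated, Unity falls back to defining the first macro of the declaration and renders the Material with the corresponding variant.
multi_compile and shader_feature both define the first macro by default, as shown in the table below: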
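The fallback rule can be sketched per declaration in Python. This is a minimal model under stated assumptions: each #pragma line is one group, exactly one entry of a group is active, and when several keywords of a group are enabled the sketch simply takes the first in declaration order (the text does not specify that case). The function name is hypothetical.

```python
def resolve_group(group, enabled_keywords):
    """Pick the entry of one keyword declaration that is active for a material."""
    for keyword in group:
        if keyword in enabled_keywords:
            return keyword
    # Nothing in this group is enabled: the first entry is the default.
    return group[0]

print(resolve_group(["A", "B"], {"B"}))
print(resolve_group(["A", "B"], {"UNRELATED"}))
print(resolve_group(["__", "A"], set()))
```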

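II. Controlling shader variant generation in a project

There are three main ways to generate shaders in a project. The table below lists their pros and cons:

What we want is to keep the number of shader variants in the project as low as possible while still rendering correctly, so that no redundant assets are produced. Fortunately, Unity already provides a solution: ShaderVariantCollection.

  • Introduction to ShaderVariantCollection

A Shader Variant Collection records which variants of a shader are actually used. Its main advantages: when shader_feature and multi_compile are used together, you can choose which variants to generate and avoid generating unnecessary ones; the shader does not have to be packed into the same bundle as its Materials, which avoids duplicating the same variant across bundles; and it shows clearly which variants need to be generated.

In Unity, Create -> Shader -> Shader Variant Collection creates a new shader variant collection file. The figure below shows how it is used:

Clicking the add-variant button opens the variant selection window.

Once the required variants are configured, pack the collection into the same bundle as the shader, and exactly the variants configured in the panel are generated.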

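  • Variant generation rules for ShaderVariantCollection

Besides the variants configured in the collection, Unity generates a few more behind the scenes. These variants are "hidden": they do not appear in the collection panel.

1. The variant for the first macro of each declaration is always generated.

Suppose a shader defines macro A with #pragma shader_feature A, and the variant for macro A is added to the collection, as shown in the figure below:

In addition to ForwardBase A, which is already in the collection, the variant ForwardBase nokeyword is also generated. When a single macro is declared, A is shorthand for _ A, so the first declared macro is actually nokeyword, and the nokeyword variant is always generated.
Likewise, with #pragma shader_feature A B C, the variant Forward A is always generated even if it was not added to the collection (when ForwardBase is the shader's only PassType).

2. Variant generation when the shader has multiple Passes:

a. Read the variants already in the ShaderVariantCollection and collect their keywords.
b. Intersect these keywords with each of the keyword lists of every Pass, and keep the list whose intersection contains the most keywords.
c. Build a ShaderVariant from the resulting keywords and the corresponding PassType, and add it to the ShaderVariantCollection.
d. If the intersection produced a new keyword set, go back to b.

The process is roughly recursive. For example:
A shader has three PassTypes: ForwardBase, ForwardAdd, and Normal (abbreviated below as Base, Add, and Normal). The macros are defined as follows:

If the ShaderVariantCollection contains the variants Base ABC and Add AE, the generated variants are: the variants for the default macro (nokeyword) of the three PassTypes (3 variants), plus Base ABC and Add AE themselves. On top of these, Unity generates 6 more: Add A, Base A, Normal A, Normal AB, Base AB, and Normal AE.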

ABC ∩ Add AE -> Add A (A is NewKeyword)
    A ∩ Base ABC -> Base A
    A ∩ Normal ABE -> Normal A
ABC ∩ Normal ABE -> Normal AB (AB is NewKeyword)
    AB ∩ Base ABC -> Base AB
AE ∩ Normal ABE -> Normal AE
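The worked example above can be reproduced with a small fixed-point loop in Python. This is a sketch of steps a to d, not Unity's implementation. It assumes one keyword set per pass (step b's choice among several lists is not modelled), reads the pass keyword sets off the intersections listed above (Base ABC, Add AE, Normal ABE), and ignores the nokeyword variants. All names are illustrative.

```python
def expand_variants(pass_keywords, collection):
    """Close a set of (pass_type, keywords) variants under intersection.

    pass_keywords: dict mapping pass type to the set of keywords it declares.
    collection:    set of (pass_type, frozenset_of_keywords) already listed.
    """
    variants = set(collection)
    pending = [keywords for _, keywords in collection]  # step a
    while pending:
        keywords = pending.pop()
        for pass_type, declared in pass_keywords.items():
            common = frozenset(keywords & declared)      # step b
            if common and (pass_type, common) not in variants:
                variants.add((pass_type, common))        # step c
                pending.append(common)                   # step d: new set, repeat
    return variants

passes = {"Base": set("ABC"), "Add": set("AE"), "Normal": set("ABE")}
listed = {("Base", frozenset("ABC")), ("Add", frozenset("AE"))}

extra = expand_variants(passes, listed) - listed
for pass_type, keywords in sorted(extra, key=lambda v: (v[0], sorted(v[1]))):
    print(pass_type, "".join(sorted(keywords)))
```

Under these assumptions the loop yields the same six additional variants as the text.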

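  • Variant selection rules

Once the collection has generated the right variants, you can switch between them at run time by changing the keywords on the Material.
Suppose a collection generates only three variants: Forward ABC, Forward ABE, and Forward nokeyword. Selection then works as follows: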

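III. Managing shader variants in a project

  • Adding variants in the project

How do we decide which variants need to go into the collection? Our approach:

1. Iterate over every Material and extract its shader keywords.
2. Intersect those keywords with the macros declared for each PassType of the shader, and add the result to the collection.

A simple example: if a Material has the keywords A B C D, the table below lists the shader's PassTypes, the macros defined in each PassType, and the variants to add to the collection:

Note that, to keep the variant generation logic simple and the collection panel readable, our own code does not add the extra variants that Unity generates to the collection panel. Keep in mind, though, that Unity does generate them.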
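The two-step procedure for populating the collection can be sketched in Python as follows. The pass names and the keywords each pass declares are hypothetical stand-ins, since the original table is not reproduced here. Skipping empty intersections reflects the choice described above of not listing the default variants Unity generates on its own.

```python
def collection_entries(material_keywords, pass_keywords):
    """Return the (pass_type, keywords) entries one material contributes.

    pass_keywords maps each PassType name to the set of keywords it declares.
    """
    entries = set()
    for pass_type, declared in pass_keywords.items():
        common = frozenset(material_keywords & declared)
        if common:  # an empty intersection is the default variant; not listed
            entries.add((pass_type, common))
    return entries

# Hypothetical pass declarations, for illustration only.
passes = {
    "ForwardBase": {"A", "B", "E"},
    "ForwardAdd": {"A", "F"},
    "ShadowCaster": {"G"},
}

entries = collection_entries({"A", "B", "C", "D"}, passes)
for pass_type, keywords in sorted(entries, key=lambda e: e[0]):
    print(pass_type, " ".join(sorted(keywords)))
```

Running this over every Material and taking the union of the results gives the contents of the collection.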

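  • Shader authoring guidelines

1. When using shader_feature, write the declaration in its full form, and do not define multiple macros in one declaration.
Full form: #pragma shader_feature _ A. Writing #pragma shader_feature A is discouraged.
Defining multiple macros in one declaration, as in #pragma shader_feature _ A B C, is also discouraged. If you must, be sure to use the full form; without it, switching shaders may not give the expected result. The exact cause has not been determined.

2. If a shader uses shader_feature, give it a CustomEditor.
Every shader that defines keywords with shader_feature needs CustomEditor "xxxx" at the end, with a class xxxx implemented in code (derived from UnityEditor.ShaderGUI) that sets the keywords.
This is because some keywords on a Material are driven by shader properties. For example, if a shader has a _NormalMap property and a keyword tied to it, that keyword must be added when the Material has a normal map and removed when it does not. A custom ShaderGUI class can do this.
For how to write this class, see builtin_shaders\Editor\StandardShaderGUI.cs in Unity's built-in shaders. The file can be downloaded from the Unity website by choosing the built-in shaders download.

3. If a macro needs to be toggled from code, define it with multi_compile so that its variant is not lost.

Original link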

Cg Standard Library Functions

Appendix E. Cg Standard Library Functions

Cg provides a set of built-in functions and predefined structures with binding semantics to simplify GPU programming. These functions are similar in spirit to the C standard library, offering a convenient set of common functions. In many cases, the functions map to a single native GPU instruction, so they are executed very quickly. Of the functions that map to multiple native GPU instructions, you may expect the most useful to become more efficient in the near future.

Although you can write your own versions of specific functions for performance or precision reasons, it is generally wiser to use the Cg Standard Library functions when possible. The Standard Library functions will continue to be optimized for future GPUs; a program written today using these functions will automatically be optimized for the latest architectures at compile time. Additionally, the Standard Library provides a convenient unified interface for both vertex and fragment programs.

This appendix describes the contents of the Cg Standard Library, and is divided into the following five sections:

  • “Mathematical Functions”
  • “Geometric Functions”
  • “Texture Map Functions”
  • “Derivative Functions”
  • “Debugging Function”

Where appropriate, functions are overloaded to support scalar and vector variants when the input and output types are the same.

E.1 Mathematical Functions

Table E-1 lists the mathematical functions that the Cg Standard Library provides. The table includes functions useful for trigonometry, exponentiation, rounding, and vector and matrix manipulations, among others. All functions work on scalars and vectors of all sizes, except where noted.

Table E-1. Mathematical Functions

Function Description
abs( x ) Absolute value of x .
acos( x ) Arccosine of x in range [0, π], x in [–1, 1].
all( x ) Returns true if every component of x is not equal to 0.

Returns false otherwise.

any( x ) Returns true if any component of x is not equal to 0.

Returns false otherwise.

asin( x ) Arcsine of x in range [–π/2, π/2]; x should be in [–1, 1].
atan( x ) Arctangent of x in range [–π/2, π/2].
atan2( y , x ) Arctangent of y / x in range [–π, π].
ceil( x ) Smallest integer not less than x .
clamp( x , a , b ) x clamped to the range [ a , b ] as follows:

  • Returns a if x is less than a .
  • Returns b if x is greater than b .
  • Returns x otherwise.
cos( x ) Cosine of x .
cosh( x ) Hyperbolic cosine of x .
cross( A , B ) Cross product of vectors A and B ;

A and B must be three-component vectors.

degrees( x ) Radian-to-degree conversion.
determinant( M ) Determinant of matrix M .
dot( A , B ) Dot product of vectors A and B .
exp( x ) Exponential function e^x .
exp2( x ) Exponential function 2^x .
floor( x ) Largest integer not greater than x .
fmod( x , y ) Remainder of x / y , with the same sign as x .

If y is 0, the result is implementation-defined.

frac( x ) Fractional part of x .
frexp( x , out exp ) Splits x into a normalized fraction in the interval [½, 1), which is returned, and a power of 2, which is stored in exp .

If x is 0, both parts of the result are 0.

isfinite( x ) Returns true if x is finite.
isinf( x ) Returns true if x is infinite.
isnan( x ) Returns true if x is NaN (Not a Number).
ldexp( x , n ) x × 2^n .
lerp( a , b , f ) Linear interpolation:

(1 – f )* a + b * f

where a and b are matching vector or scalar types. f can be either a scalar or a vector of the same type as a and b .

lit( NdotL , NdotH , m ) Computes lighting coefficients for ambient, diffuse, and specular light contributions.

Expects the NdotL parameter to contain N · L and the NdotH parameter to contain N · H .

Returns a four-component vector as follows:

  • The x component of the result vector contains the ambient coefficient, which is always 1.0.
  • The y component contains the diffuse coefficient, which is 0 if ( N · L ) < 0; otherwise ( N · L ).
  • The z component contains the specular coefficient, which is 0 if either ( N · L ) < 0 or ( N · H ) < 0; ( N · H )^m otherwise.
  • The w component is 1.0.

There is no vectorized version of this function.

log( x ) Natural logarithm ln( x ) ; x must be greater than 0.
log2( x ) Base 2 logarithm of x ; x must be greater than 0.
log10( x ) Base 10 logarithm of x ; x must be greater than 0.
max( a , b ) Maximum of a and b .
min( a , b ) Minimum of a and b .
modf( x , out ip ) Splits x into integral and fractional parts, each with the same sign as x .

Stores the integral part in ip and returns the fractional part.

mul( M , N ) Matrix product of matrix M and matrix N : element ( i , j ) of the result is the sum over k of M[i][k] * N[k][j].

If M has size A x B , and N has size B x C , returns a matrix of size A x C .

mul( M , v ) Product of matrix M and column vector v : component i of the result is the sum over k of M[i][k] * v[k].

If M is an A x B matrix and v is a B x 1 vector, returns an A x 1 vector.

mul( v , M ) Product of row vector v and matrix M : component j of the result is the sum over k of v[k] * M[k][j].

If v is a 1 x A vector and M is an A x B matrix, returns a 1 x B vector.

noise( x ) Either a one-, two-, or three-dimensional noise function, depending on the type of its argument. The returned value is between 0 and 1, and is always the same for a given input value.
pow( x , y ) x^y .
radians( x ) Degree-to-radian conversion.
round( x ) Closest integer to x .
rsqrt( x ) Reciprocal square root of x ; x must be greater than 0.
saturate( x ) Clamps x to the [0, 1] range.
sign( x ) 1 if x > 0; –1 if x < 0; 0 otherwise.
sin( x ) Sine of x .
sincos(float x , out s , out c ) s is set to the sine of x , and c is set to the cosine of x .

If both sin( x ) and cos( x ) are needed, this function is more efficient than calculating each individually.

sinh( x ) Hyperbolic sine of x .
smoothstep( min , max , x ) For values of x between min and max , returns a smoothly varying value that ranges from 0 at x = min to 1 at x = max .

x is clamped to the range [ min , max ] and then the interpolation formula is evaluated:

–2*(( x – min )/( max – min ))^3 + 3*(( x – min )/( max – min ))^2

step( a , x ) 0 if x < a ;

1 if x >= a .

sqrt( x ) Square root of x ;

x must be greater than 0.

tan( x ) Tangent of x .
tanh( x ) Hyperbolic tangent of x .
transpose( M ) Matrix transpose of matrix M .

If M is an A x B matrix, the transpose of M is a B x A matrix whose first column is the first row of M , whose second column is the second row of M , whose third column is the third row of M , and so on.
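As a cross-check of several formulas in Table E-1, here is a scalar-only Python sketch of clamp, saturate, lerp, step, smoothstep, and lit, written directly from the descriptions above. It is a reading aid rather than the Cg implementation: the functions take plain floats instead of Cg vectors, and lit returns a 4-tuple in place of a float4. The fmod and frexp rows are illustrated with Python's own math.fmod, math.frexp, and math.ldexp, which follow the same conventions as the table describes.

```python
import math

def clamp(x, a, b):
    """x limited to the range [a, b]."""
    return max(a, min(b, x))

def saturate(x):
    """Clamp x to [0, 1]."""
    return clamp(x, 0.0, 1.0)

def lerp(a, b, f):
    """Linear interpolation: (1 - f) * a + b * f."""
    return (1.0 - f) * a + b * f

def step(a, x):
    """0 if x < a, 1 if x >= a."""
    return 0.0 if x < a else 1.0

def smoothstep(lo, hi, x):
    """Clamp x to [lo, hi], then evaluate -2*t^3 + 3*t^2 with t = (x - lo)/(hi - lo)."""
    t = saturate((x - lo) / (hi - lo))
    return -2.0 * t ** 3 + 3.0 * t ** 2

def lit(n_dot_l, n_dot_h, m):
    """(ambient, diffuse, specular, 1) lighting coefficients as described in Table E-1."""
    diffuse = max(n_dot_l, 0.0)
    if n_dot_l < 0.0 or n_dot_h < 0.0:
        specular = 0.0
    else:
        specular = n_dot_h ** m
    return (1.0, diffuse, specular, 1.0)

print(lerp(2.0, 4.0, 0.25))
print(smoothstep(0.0, 1.0, 0.5))
print(lit(0.5, 0.5, 2))

# fmod keeps the sign of x, unlike Python's % operator.
print(math.fmod(-5.5, 2.0), -5.5 % 2.0)

# frexp returns a fraction in [1/2, 1) and a power of 2; ldexp reverses it.
fraction, exponent = math.frexp(8.0)
print(fraction, exponent, math.ldexp(fraction, exponent))
```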

E.2 Geometric Functions

Table E-2 presents the geometric functions that are provided in the Cg Standard Library.

Table E-2. Geometric Functions

Function Description
distance( pt1 , pt2 ) Euclidean distance between points pt1 and pt2 .
faceforward( N , I , Ng ) N if dot( Ng , I ) < 0; –N otherwise.
length( v ) Euclidean length of a vector.
normalize( v ) Returns a vector of length 1 that points in the same direction as vector v .
reflect( I , N ) Computes reflection vector from entering ray direction I and surface normal N .

Valid only for three-component vectors.

refract( I , N , eta ) Given entering ray direction I , surface normal N , and relative index of refraction eta , computes refraction vector.

If the angle between I and N is too large for a given eta , returns (0, 0, 0).

Valid only for three-component vectors.
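The table gives reflect, refract, and faceforward in words only. The Python sketch below uses the standard vector formulas for them: reflection as I – 2(N · I)N and refraction from Snell's law. These formulas are not spelled out in the text, so treat them as an assumption about what the functions compute. Vectors are plain 3-tuples, and I and N are assumed to be normalized.

```python
import math

def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))

def reflect(i, n):
    """Reflection of incoming direction i about unit normal n: i - 2*(n.i)*n."""
    d = dot(n, i)
    return tuple(ic - 2.0 * d * nc for ic, nc in zip(i, n))

def refract(i, n, eta):
    """Refraction of unit direction i at unit normal n with relative index eta.

    Returns (0, 0, 0) when the angle is too large (total internal reflection).
    """
    cos_i = -dot(i, n)
    cos_t2 = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if cos_t2 <= 0.0:
        return (0.0, 0.0, 0.0)
    k = eta * cos_i - math.sqrt(cos_t2)
    return tuple(eta * ic + k * nc for ic, nc in zip(i, n))

def faceforward(n, i, ng):
    """n if dot(ng, i) < 0, otherwise -n."""
    return n if dot(ng, i) < 0.0 else tuple(-c for c in n)

up = (0.0, 1.0, 0.0)
print(reflect((1.0, -1.0, 0.0), up))
print(refract((0.0, -1.0, 0.0), up, 1.0))
print(refract((1.0, 0.0, 0.0), up, 1.5))
print(faceforward(up, (0.0, -1.0, 0.0), up))
```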

E.3 Texture Map Functions

Table E-3 presents the texture map functions that are provided in the Cg Standard Library. Currently, these texture functions are fully supported by the ps_2_0 , ps_2_x , arbfp1 , and fp30 profiles (though only OpenGL profiles support the samplerRECT functions). They will also be supported by all future advanced fragment profiles with texture-mapping capabilities. All of the functions listed in Table E-3 return a float4 value.

Table E-3. Texture Map Functions

Function Description
tex1D(sampler1D tex , float s ) 1D nonprojective texture query
tex1D(sampler1D tex , float s , float dsdx , float dsdy ) 1D nonprojective texture query with derivatives
tex1D(sampler1D tex , float2 sz ) 1D nonprojective depth compare texture query
tex1D(sampler1D tex , float2 sz , float dsdx , float dsdy ) 1D nonprojective depth compare texture query with derivatives
tex1Dproj(sampler1D tex , float2 sq ) 1D projective texture query
tex1Dproj(sampler1D tex , float3 szq ) 1D projective depth compare texture query
tex2D(sampler2D tex , float2 s ) 2D nonprojective texture query
tex2D(sampler2D tex , float2 s , float2 dsdx , float2 dsdy ) 2D nonprojective texture query with derivatives
tex2D(sampler2D tex , float3 sz ) 2D nonprojective depth compare texture query
tex2D(sampler2D tex , float3 sz , float2 dsdx , float2 dsdy ) 2D nonprojective depth compare texture query with derivatives
tex2Dproj(sampler2D tex , float3 sq ) 2D projective texture query
tex2Dproj(sampler2D tex , float4 szq ) 2D projective depth compare texture query
texRECT(samplerRECT tex , float2 s ) 2D nonprojective texture rectangle texture query (OpenGL only)
texRECT(samplerRECT tex , float2 s , float2 dsdx , float2 dsdy ) 2D nonprojective texture rectangle texture query with derivatives (OpenGL only)
texRECT(samplerRECT tex , float3 sz ) 2D nonprojective texture rectangle depth compare texture query (OpenGL only)
texRECT(samplerRECT tex , float3 sz , float2 dsdx , float2 dsdy ) 2D nonprojective texture rectangle depth compare texture query with derivatives (OpenGL only)
texRECTproj(samplerRECT tex , float3 sq ) 2D texture rectangle projective texture query (OpenGL only)
texRECTproj(samplerRECT tex , float4 szq ) 2D texture rectangle projective depth compare texture query (OpenGL only)
tex3D(sampler3D tex , float3 s ) 3D nonprojective texture query
tex3D(sampler3D tex , float3 s , float3 dsdx , float3 dsdy ) 3D nonprojective texture query with derivatives
tex3Dproj(sampler3D tex , float4 sq ) 3D projective texture query
texCUBE(samplerCUBE tex , float3 s ) Cube map nonprojective texture query
texCUBE(samplerCUBE tex , float3 s , float3 dsdx , float3 dsdy ) Cube map nonprojective texture query with derivatives
texCUBEproj(samplerCUBE tex , float4 sq ) Cube map projective texture query (ignores q)

Because of the limited pixel programmability of older hardware, the ps_1_1 , ps_1_2 , ps_1_3 , and fp20 profiles have restrictions on the use of texture-mapping functions. See the documentation for these profiles for more information.

In the table, the name of the second argument to each function indicates how its values are used when performing the texture lookup:

  • s indicates a one-, two-, or three-component texture coordinate.
  • z indicates a depth comparison value for shadow map lookups.
  • q indicates a perspective value, and is used to divide the texture coordinate ( s ) before the texture lookup is performed.
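The role of q can be illustrated with a tiny Python sketch of the divide that a projective lookup performs before sampling. It covers only the sq argument form (texture coordinate followed by q); the helper name is hypothetical and no texture is actually sampled.

```python
def projective_coords(sq):
    """Split an (s..., q) tuple and return the texture coordinate s divided by q."""
    *s, q = sq
    return tuple(component / q for component in s)

# A float3 sq argument as passed to tex2Dproj: s = (2, 4), q = 2.
print(projective_coords((2.0, 4.0, 2.0)))
```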

When you use the texture functions that allow specifying a depth comparison value, the associated texture unit must be configured for depth-compare texturing. Otherwise, no depth comparison will actually be performed.

E.4 Derivative Functions

Table E-4 presents the derivative functions that are supported by the Cg Standard Library. Vertex profiles do not support these functions.

Table E-4. Derivative Functions

Function Description
ddx( a ) Approximate partial derivative of a with respect to screen-space x coordinate
ddy( a ) Approximate partial derivative of a with respect to screen-space y coordinate

E.5 Debugging Function

Table E-5 presents the debugging function that is supported by the Cg Standard Library. Vertex profiles are not required to support this function.

Table E-5. Debugging Function

Function Description
void debug(float4 x ) If the compiler’s DEBUG option is enabled, calling this function causes the value x to be copied to the COLOR output of the program, and execution of the program is terminated.

If the compiler’s DEBUG option is not enabled, this function does nothing.

The intent of the debug function is to allow a program to be compiled twice—once with the DEBUG option and once without. By executing both programs, it is possible to obtain one frame buffer containing the final output of the program and another frame buffer containing an intermediate value to be examined for debugging purposes.