I'm deciding between a header-only design and a header-plus-source design. Does the header-plus-source layout still allow the compiler to optimize across object files at link time, for example by inlining functions?
A header and the source file that includes it are typically compiled as a single translation unit, since the header's contents are included in the source file. So within one translation unit that won't be an issue (unless you have a peculiar environment where headers are compiled separately).
GCC does support optimizations across different translation units. See Link Time Optimization.
The documentation for the -flto option has the details:
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC's internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit. To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link. It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time.
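To make the difference concrete, here is a minimal single-file sketch of a header-plus-source split. The file names (math_utils.h, math_utils.cpp, main.cpp) and the functions square, cube, and sum_of_powers are hypothetical, chosen only for illustration; whether a given call is actually inlined is up to the optimizer.

```cpp
// Single-file sketch; the "files" marked below are hypothetical.

// ---- math_utils.h ----
// Defined in the header: the body is visible in every translation unit
// that includes it, so the compiler can consider inlining it without LTO.
inline int square(int x) { return x * x; }

// Only declared in the header: other translation units see just this
// prototype, not the body.
int cube(int x);

// ---- math_utils.cpp ----
// The definition lives in its own translation unit. Without -flto, a caller
// in another .cpp file normally ends up with a real call; with -flto the
// link-time optimizer can see this body and may inline it.
int cube(int x) { return x * x * x; }

// ---- main.cpp ----
// Calls one function of each kind.
int sum_of_powers(int x) { return square(x) + cube(x); }

// Possible build, assuming the code is split into the files named above:
//   g++ -O2 -flto -c math_utils.cpp
//   g++ -O2 -flto -c main.cpp
//   g++ -O2 -flto math_utils.o main.o -o app
```

Note that -flto and the optimization level appear on both the compile and the link commands, as the quoted documentation recommends.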