Awesome
Learning Node.js
The sole purpose of this project is to aid in learning Node.js internals. Please note that viewing this README.md as the main page of the project might truncate it as it has become quite large. By opening the file (clicking on it) you can view the whole content.
Prerequisites
You'll need to have checked out the node.js source.
Compiling Node.js with debug enbled:
$ ./configure --debug
$ make -C out BUILDTYPE=Debug
After compiling (with debugging enabled) start node using lldb:
$ cd node/out/Debug
$ lldb ./node
Node uses Generate Your Projects (gyp) for which I was not familiar with so there is a example project in gyp to look into it.
Running the Node.js tests
$ make -j8 test
Notes
The rest of this page contains notes gathred while setting through the code base: (These are more sections than listed here but they might be hard to follow)
- Background
- Start up
- Loading of builtins
- Environment
- TCPWrap
- Running a script
- Event loop
- setTimeout
- setImmediate
- nextTick
- AsyncWrap
- lldb
- Promises
- Libuv Thread pool
- bootstrap_node.js compilation and execution walkthrough
Background
Node.js is roughly Google V8, libuv and Node.js core which glues everything together.
V8 bascially consists of the memory management of the heap and the execution stack (very simplified but helps make my point). If you are used to web client side development you'll know about the WebAPIs that are also available like DOM, AJAX, setTimeout etc. This functionality is not provided by V8 but in instead by chrome. There is also nothing about a event loop in V8, this is also something that is provided by chrome.
+------------------------------------------------------------------------------------------+
| Google Chrome |
| |
| +----------------------------------------+ +------------------------------+ |
| | Google V8 | | WebAPIs | |
| | +-------------+ +---------------+ | | | |
| | | Heap | | Stack | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | +-------------+ +---------------+ | | | |
| | | | | | |
| +----------------------------------------+ +------------------------------+ |
| |
| |
| +---------------------+ +---------------------------------------+ |
| | Event loop | | Render task queue | |
| | | | | |
| +---------------------+ +---------------------------------------+ |
| +---------------------------------------+ |
| | Callback/task queue | |
| | | |
| +---------------------------------------+ |
| |
| |
+------------------------------------------------------------------------------------------+
The execution stack is a stack of frame pointers. For each function called, that function will be pushed onto the stack. When a function returns it will be removed. If that function calls other functions they will be pushed onto the stack. When all functions have returned execution can proceed from the returned to point. If one of the functions performs an operation that takes time, progress will not be made until it completes as the only way to complete is that the function returns and is popped off the stack. This is what happens when you have a single threaded programming language.
Aychnronous work can be done by calling into the WebAPIs, for example calling setTimeout which will call out to the WebAPI and then return. The functionality for setTimeout is provided by the WebAPI and when the timer is due the WebAPI will push the callback onto the callback queue. Items from the callback queue will be picked up by the event loop and pushed onto the stack for execution.
TODO: Is the microtask queue considered part of V8 or part of chrome?
Now lets compare this with Node.js:
+------------------------------------------------------------------------------------------+
| Node.js |
| |
| +----------------------------------------+ +------------------------------+ |
| | Google V8 | | Node Core APIs | |
| | +-------------+ +---------------+ | | | |
| | | Heap | | Stack | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | +-------------+ +---------------+ | | | |
| | | | | |
| +----------------------------------------+ +------------------------------+ |
| |
| |
| +---------------------+ +---------------------------------------+ |
| | libuv | | Microtask queue | |
| | Event Loop | | | |
| +---------------------+ +---------------------------------------+ |
| +---------------------------------------+ |
| | Callback queue | |
| | | |
| +---------------------------------------+ |
| |
| |
+------------------------------------------------------------------------------------------+
Taking the same example from above, setTimeout
, this would be a call to Node Core API and then
the function will return. When the timer expires Node Core API will push the callback onto the
callback queue.
The event loop in Node is provided by libuv, whereas in chrome this is provided by the browser
(chromium I believe)
TODO: Is the microtask queue considered part of V8 or part of node?
Starting Node
To start and stop at first line in a js program use:
$ lldb -- node --debug-brk test.js
Set a break point in node_main.cc:
(lldb) breakpoint set --file node_main.cc --line 52
(lldb) run
Walkthrough
node_main.cc
will bascially call node::Start which we can find in src/node.cc.
Start(int argc, char** argv)
Starts by calling PlatformInit
default_platform = v8::platform::CreateDefaultPlatform(v8_thread_pool_size);
V8::InitializePlatform(default_platform);
V8::Initialize();
...
Init(&argc, const_cast<const char**>(argv), &exec_argc, &exec_argv);
PlatformInit()
From what I understand this mainly sets up things like signals and file descriptor limits.
Init
Init has some libuv code that looks familiar to what I played around with in learning-libuv.
uv_async_init(uv_default_loop(),
&dispatch_debug_messages_async,
DispatchDebugMessagesAsyncCallback);
Now I've not used uv_async_init
but looking at the docs this is done to allow a different thread to wake up the event loop and have the
callback invoked. uv_async_init looks like this:
int uv_async_init(uv_loop_t* loop, uv_async_t* async, uv_async_cb async_cb)
To understand this better this standalone example helped my clarify things a bit.
uv_unref(reinterpret_cast<uv_handle_t*>(&dispatch_debug_messages_async));
I believe this is done so that the ref count of the dispatch_debug_message_async handle is decremented. If this handle is the only thing referened that would cause the event loop to be considered alive and it will continue to iterate.
So a different thread can use uv_async_sent(&dispatch_debug_messages_async) to to wake up the eventloop and have the DispatchDebugMessagesAsyncCallback function called.
EnableDebugSignalHandler
This is where we signal the semaphore which will increment the counter, and any threads in the wait queue will now run. So our thread that is blocked waiting for this debug_semaphore will be able to proceed and TryStartDebugger will be called.
uv_sem_post(&debug_semaphore);
But what will actually send the signal for all this to happen? I think this is done DebugProcess(const FunctionCallbackInfo<Value>& args). Setting a break point confirmed this and the back trace:
(lldb) bt
* thread #1: tid = 0x11f57b1, 0x0000000100cafddc node`node::DebugProcess(args=0x00007fff5fbf4700) + 12 at node.cc:3754, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100cafddc node`node::DebugProcess(args=0x00007fff5fbf4700) + 12 at node.cc:3754
frame #1: 0x000000010028618b node`v8::internal::FunctionCallbackArguments::Call(this=0x00007fff5fbf4878, f=(node`node::DebugProcess(v8::FunctionCallbackInfo<v8::Value> const&) at node.cc:3753))(v8::FunctionCallbackInfo<v8::Value> const&)) + 139 at arguments.cc:33
frame #2: 0x00000001002f58f3 node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(isolate=0x0000000104004000, args=BuiltinArguments<v8::internal::BuiltinExtraArguments::kTarget> @ 0x00007fff5fbf49d0)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>) + 1619 at builtins.cc:3915
frame #3: 0x000000010031ce36 node`v8::internal::Builtin_Impl_HandleApiCall(args=v8::internal::(anonymous namespace)::HandleApiCallArgumentsType @ 0x00007fff5fbf4a38, isolate=0x0000000104004000)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>, v8::internal::Isolate*) + 86 at builtins.cc:3939
frame #4: 0x00000001002f9c8f node`v8::internal::Builtin_HandleApiCall(args_length=3, args_object=0x00007fff5fbf4b28, isolate=0x0000000104004000) + 143 at builtins.cc:3936
frame #5: 0x00003ca66bf0961b
frame #6: 0x00003ca66c081e0d
frame #7: 0x00003ca66bf0d17a
frame #8: 0x00003ca66c01edb3
frame #9: 0x00003ca66c01e802
frame #10: 0x00003ca66bf0d17a
frame #11: 0x00003ca66c081b25
frame #12: 0x00003ca66c074fd2
frame #13: 0x00003ca66c074c6b
frame #14: 0x00003ca66bf38024
frame #15: 0x00003ca66bf22962
frame #16: 0x00000001006f11df node`v8::internal::(anonymous namespace)::Invoke(isolate=0x0000000104004000, is_construct=false, target=Handle<v8::internal::Object> @ 0x00007fff5fbf4f50, receiver=Handle<v8::internal::Object> @ 0x00007fff5fbf4f48, argc=0, args=0x0000000000000000, new_target=Handle<v8::internal::Object> @ 0x00007fff5fbf4f40) + 607 at execution.cc:97
frame #17: 0x00000001006f0f61 node`v8::internal::Execution::Call(isolate=0x0000000104004000, callable=Handle<v8::internal::Object> @ 0x00007fff5fbf50a8, receiver=Handle<v8::internal::Object> @ 0x00007fff5fbf50a0, argc=0, argv=0x0000000000000000) + 1313 at execution.cc:163
frame #18: 0x000000010023f4af node`v8::Function::Call(this=0x0000000104062c20, context=(val_ = 0x00000001040404c8), recv=(val_ = 0x00000001040628a0), argc=0, argv=0x0000000000000000) + 671 at api.cc:4404
frame #19: 0x000000010023f611 node`v8::Function::Call(this=0x0000000104062c20, recv=(val_ = 0x00000001040628a0), argc=0, argv=0x0000000000000000) + 113 at api.cc:4413
frame #20: 0x0000000100c8f3b8 node`node::AsyncWrap::MakeCallback(this=0x0000000104800d50, cb=(val_ = 0x00000001040404a0), argc=3, argv=0x00007fff5fbf5690) + 2600 at async-wrap.cc:284
frame #21: 0x0000000100c937e6 node`node::AsyncWrap::MakeCallback(this=0x0000000104800d50, symbol=(val_ = 0x000000010403e5b0), argc=3, argv=0x00007fff5fbf5690) + 198 at async-wrap-inl.h:110
frame #22: 0x0000000100d06c67 node`node::StreamBase::EmitData(this=0x0000000104800d50, nread=43, buf=(val_ = 0x0000000104040488), handle=(val_ = 0x0000000000000000)) + 551 at stream_base.cc:427
frame #23: 0x0000000100d0adc3 node`node::StreamWrap::OnReadImpl(nread=43, buf=0x00007fff5fbf58f8, pending=UV_UNKNOWN_HANDLE, ctx=0x0000000104800d50) + 675 at stream_wrap.cc:222
frame #24: 0x0000000100ca25a7 node`node::StreamResource::OnRead(this=0x0000000104800d50, nread=43, buf=0x00007fff5fbf58f8, pending=UV_UNKNOWN_HANDLE) + 119 at stream_base.h:171
frame #25: 0x0000000100d0b93f node`node::StreamWrap::OnReadCommon(handle=0x0000000104800df0, nread=43, buf=0x00007fff5fbf58f8, pending=UV_UNKNOWN_HANDLE) + 351 at stream_wrap.cc:246
frame #26: 0x0000000100d0b3d4 node`node::StreamWrap::OnRead(handle=0x0000000104800df0, nread=43, buf=0x00007fff5fbf58f8) + 116 at stream_wrap.cc:261
frame #27: 0x0000000100f70e93 node`uv__read(stream=0x0000000104800df0) + 1555 at stream.c:1192
frame #28: 0x0000000100f6cb8c node`uv__stream_io(loop=0x00000001019ee200, w=0x0000000104800e78, events=1) + 348 at stream.c:1259
frame #29: 0x0000000100f7b784 node`uv__io_poll(loop=0x00000001019ee200, timeout=7073) + 3492 at kqueue.c:276
frame #30: 0x0000000100f5e62f node`uv_run(loop=0x00000001019ee200, mode=UV_RUN_ONCE) + 207 at core.c:354
frame #31: 0x0000000100cb33a0 node`node::StartNodeInstance(arg=0x00007fff5fbfea60) + 912 at node.cc:4303
frame #32: 0x0000000100cb2f8d node`node::Start(argc=2, argv=0x0000000103404a60) + 253 at node.cc:4380
frame #33: 0x0000000100cede9b node`main(argc=2, argv=0x00007fff5fbfeb18) + 75 at node_main.cc:54
frame #34: 0x0000000100001634 node`start + 52
So to recap, SetupProcessObject sets up the process object for node and one of the methods it sets is '_debugProcess':
env->SetMethod(process, "_debugProcess", DebugProcess);
SetupProcessObject is called from Environment::Start (src/env.cc):
* thread #1: tid = 0x1207377, 0x0000000100cad2fc node`node::SetupProcessObject(env=0x00007fff5fbfe108, argc=2, argv=0x0000000103604a20, exec_argc=0, exec_argv=0x0000000103604410) + 11020 at node.cc:3205, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100cad2fc node`node::SetupProcessObject(env=0x00007fff5fbfe108, argc=2, argv=0x0000000103604a20, exec_argc=0, exec_argv=0x0000000103604410) + 11020 at node.cc:3205
frame #1: 0x0000000100c91bc7 node`node::Environment::Start(this=0x00007fff5fbfe108, argc=2, argv=0x0000000103604a20, exec_argc=0, exec_argv=0x0000000103604410, start_profiler_idle_notifier=false) + 919 at env.cc:91
frame #2: 0x0000000100cb32b1 node`node::StartNodeInstance(arg=0x00007fff5fbfeaa0) + 673 at node.cc:4274
frame #3: 0x0000000100cb2f8d node`node::Start(argc=2, argv=0x0000000103604a20) + 253 at node.cc:4380
frame #4: 0x0000000100cede9b node`main(argc=2, argv=0x00007fff5fbfeb58) + 75 at node_main.cc:54
frame #5: 0x0000000100001634 node`start + 52
After that detour we are back in the Start method, and the next line is:
default_platform = v8::platform::CreateDefaultPlatform(v8_thread_pool_size);
(lldb) p v8_thread_pool_size
(int) $30 = 4
We can find the implementation of this in deps/v8/src/libplatform/default-platform.cc
.
The call is the same as was used in the hello_world example except here the size of the thread pool is being passed in
and in the hello_world the no arguments method was called. I only skimmed this part when I was working through that
example so it might be good to figure out what is going on here.
An instance of DefaultPlatform is created and then its SetThreadPoolSize method is called with v8_thread_pool_size. When
the size is not given it will default to p SysInfo::NumberOfProcessors()
.
Next, EnsureInitialized is called which does a check to see if the instance has already been initilized and if not:
for (int i = 0; i < thread_pool_size_; ++i)
thread_pool_.push_back(new WorkerThread(&queue_));
This will create new workers and threads for them. This call finds its way down into deps/v8/src/base/platform/platform-posix.c and its Thread::Start method:
LockGuard<Mutex> lock_guard(&data_->thread_creation_mutex_);
result = pthread_create(&data_->thread_, &attr, ThreadEntry, this);
We can see this is where the creation and starting of a new thread is done. The first argument is the pthread_t to associate with the function ThreadEntry which is the top level entry point for the new thread. The second argument are additional attributes. The third argument is the function as already mentioned and the fourth parameter is the argument to the function. So we can see that ThreadEntry takes the current instance as an argument (well it takes a void pointer):
static void* ThreadEntry(void* arg) {
Thread* thread = reinterpret_cast<Thread*>(arg);
// We take the lock here to make sure that pthread_create finished first since
// we don't know which thread will run first (the original thread or the new
// one).
{ LockGuard<Mutex> lock_guard(&thread->data()->thread_creation_mutex_); }
SetThreadName(thread->name());
DCHECK(thread->data()->thread_ != kNoThread);
thread->NotifyStartedAndRun();
return NULL;
}
ThreadEntry is using a LockGuard and creates a scope to use the Resource Acquisition Is Initialization (RAII) idiom for a mutex. The scope is very limited but like the comment says is really just trying to aquire the lock, which was the same that was used when creating the thread above. So, lets take a look at thread->NotifyStartedAndRun()
NotifyStartedAndRun
void NotifyStartedAndRun() {
if (start_semaphore_) start_semaphore_->Signal();
Run();
}
LockGuard
So a lock guard is an implementation of Resource Acquisition Is Initialization (RAII) and takes a mutex in its constructor which it then calls lock on. When this instance goes out of scope its descructor will be called and it will call unlock guarenteeing that the mutex will be unlocked even if an exception is thrown. The Mutex class can be found in deps/v8/src/base/platform/mutex.h. On a Unix system the mutex will be of type pthread_mutex_t
We can verify this by inspecting the threads before and after the calls. Before:
(lldb) thread list
Process 4614 stopped
* thread #1: tid = 0xe0d19a, 0x0000000100f80cc1 node`v8::base::Thread::Start(this=0x0000000103206110) + 321 at platform-posix.cc:618, queue = 'com.apple.main-thread', stop reason = step over
thread #2: tid = 0xe0d2f2, 0x00007fff858affae libsystem_kernel.dylib`semaphore_wait_trap + 10
After:
(lldb) thread list
Process 4669 stopped
* thread #1: tid = 0xe0e3a7, 0x0000000100f80cdb node`v8::base::Thread::Start(this=0x0000000103206530) + 347 at platform-posix.cc:619, queue = 'com.apple.main-thread', stop reason = step over
thread #2: tid = 0xe0e46d, 0x00007fff858affae libsystem_kernel.dylib`semaphore_wait_trap + 10
thread #3: tid = 0xe0f230, 0x0000000100f81570 node`v8::base::Thread::data(this=0x0000000103206530) at platform.h:463
What does one of these thread do? Lets set a breakpoint in the ThreadEntry method:
(lldb) breakpoint set --file platform-posix.cc --line 582
NodeInstanceData
In the Start method we can see a block with the creation of a new NodeInstanceData instance:
int exit_code = 1;
{
NodeInstanceData instance_data(NodeInstanceType::MAIN,
uv_default_loop(),
argc,
const_cast<const char**>(argv),
exec_argc,
exec_argv,
use_debug_agent);
StartNodeInstance(&instance_data);
exit_code = instance_data.exit_code();
}
There are two NodeInstanceTypes, MAIN and WORKER. The second argument is the libuv event loop to be used.
StartNodeInstance
We are passing the NodeInstanceData instance we created above. The code in this method is very similar to the code that we used in the hello-world.cc. A new Isolate is created. Remember that an Isolate is an independant copy of the V8 runtime, with its own heap.
Environment* env = CreateEnvironment(isolate, context, instance_data);
CreateEnvironment(isolate, context, instance_data)
Local<FunctionTemplate> process_template = FunctionTemplate::New(isolate);
process_template->SetClassName(FIXED_ONE_BYTE_STRING(isolate, "process"));
...
SetupProcessObject(env, argc, argv, exec_argc, exec_argv);
This looks like the node process
object is being created here.
All the JavaScript built-in objects are provided by the V8 runtime but the process object is not one of them. So here we are doing
the same as in the hello-world example above but naming the object 'process'
auto maybe = process->SetAccessor(env->context(),
env->title_string(),
ProcessTitleGetter,
ProcessTitleSetter,
env->as_external());
CHECK(maybe.FromJust());
Notice that SetAccessor returns an "optional" MayBe type.
READONLY_PROPERTY(process,
"version",
FIXED_ONE_BYTE_STRING(env->isolate(), NODE_VERSION));
The above is adding properties to the 'process' object. The first being version and then:
process.moduleLoadList
process.versions[
http_parser,
node,
v8,
vu,
zlib,
ares,
icu,
modules
]
process.icu_data_dir
process.arch
process.platform
process.release
process.release.name
process.release.lts
process.release.sourceUrl
process.release.headersUrl
process.env
process.pid
process.features
READONLY_PROPERTY(process,
"moduleLoadList",
env->module_load_list_array());
I was not aware of this one but process.moduleLoadList will return an array of modules loaded.
READONLY_PROPERTY(process, "versions", versions);
Next up is process.versions which on my local machine returns:
> process.versions
{ http_parser: '2.5.2',
node: '4.4.3',
v8: '4.5.103.35',
uv: '1.8.0',
zlib: '1.2.8',
ares: '1.10.1-DEV',
icu: '56.1',
modules: '46',
openssl: '1.0.2g' }
After setting up all the object (SetupProcessObject) this methods returns. There is still no sign of the loading of the 'bootstrap_node.js' script. This is done in LoadEnvironment.
LoadEnvironment
Local<String> loaders_name = FIXED_ONE_BYTE_STRING(env->isolate(), "internal/bootstrap/loaders.js");
Local<Function> loaders_bootstrapper = GetBootstrapper(env, LoadersBootstrapperSource(env), loaders_name);
Local<String> node_name = FIXED_ONE_BYTE_STRING(env->isolate(), "internal/bootstrap/node.js");
Local<Function> node_bootstrapper = GetBootstrapper(env, NodeBootstrapperSource(env), node_name);
...
// Create binding loaders
v8::Local<v8::Function> get_binding_fn = env->NewFunctionTemplate(GetBinding)->GetFunction(env->context()).ToLocalChecked();
v8::Local<v8::Function> get_linked_binding_fn = env->NewFunctionTemplate(GetLinkedBinding)->GetFunction(env->context()).ToLocalChecked();
v8::Local<v8::Function> get_internal_binding_fn = env->NewFunctionTemplate(GetInternalBinding)->GetFunction(env->context()).ToLocalChecked();
Local<Value> loaders_bootstrapper_args[] = {
env->process_object(),
get_binding_fn,
get_linked_binding_fn,
get_internal_binding_fn
};
Local<Value> bootstrapped_loaders;
if (!ExecuteBootstrapper(env, loaders_bootstrapper,
arraysize(loaders_bootstrapper_args),
loaders_bootstrapper_args,
&bootstrapped_loaders)) {
return;
}
So ExecuteBootstrapper will call the function in internal/bootstrap/loaders.js
:
(lldb) jlh bootstrapper
I'm not showing the output as it is a little long but you can see the contents of loaders.js. We can see the arguments that the function takes:
- source code: (process, getBinding, getLinkedBinding, getInternalBinding) {
These match the arguments from loaders_bootstrapper_args
above.
Next we have:
// Bootstrap Node.js
Local<Value> bootstrapped_node;
Local<Value> node_bootstrapper_args[] = {
env->process_object(),
bootstrapped_loaders
};
if (!ExecuteBootstrapper(env, node_bootstrapper,
arraysize(node_bootstrapper_args),
node_bootstrapper_args,
&bootstrapped_node)) {
return;
Notice that bootstrapped_loaders
is passed as an argument:
(lldb) jlh bootstrapped_loaders
0x9b2977bc179: [JS_OBJECT_TYPE]
- map: 0x9b269e9e361 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x9b27a204479 <Object map = 0x9b269e822a1>
- elements: 0x9b2e0782251 <FixedArray[0]> [HOLEY_ELEMENTS]
- properties: 0x9b2e0782251 <FixedArray[0]> {
#internalBinding: 0x9b2977b3e31 <JSFunction internalBinding (sfi = 0x9b27a2794d9)> (data field 0)
#NativeModule: 0x9b2977b3369 <JSFunction NativeModule (sfi = 0x9b27a2792f9)> (data field 1)
}
internalBinding
and NativeModule
properties are destructed in the function:
- source code: (process, { internalBinding, NativeModule }) {
So we have GetBinding
, GetLinkedBinding
, and GetInternalBinding
which are all passed to the bootstrap function.
A native module can be created using using one of the following types ('src/node_internals.h'):
enum {
NM_F_BUILTIN = 1 << 0,
NM_F_LINKED = 1 << 1,
NM_F_INTERNAL = 1 << 2,
};
For example, the crypto module (src/node_crypto.cc
) is registered using the macro:
NODE_BUILTIN_MODULE_CONTEXT_AWARE(crypto, node::crypto::Initialize)
#define NODE_BUILTIN_MODULE_CONTEXT_AWARE(modname, regfunc) \
NODE_MODULE_CONTEXT_AWARE_CPP(modname, regfunc, nullptr, NM_F_BUILTIN)
Modules that use NF_F_BUILTIN
:
NODE_BUILTIN_MODULE_CONTEXT_AWARE(inspector, node::inspector::Initialize);
NODE_BUILTIN_MODULE_CONTEXT_AWARE(util, node::util::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tcp_wrap, node::TCPWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(url, node::url::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(udp_wrap, node::UDPWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(inspector, Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(process_wrap, node::ProcessWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(buffer, node::Buffer::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(contextify, node::contextify::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(os, node::os::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(async_wrap, node::AsyncWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(fs_event_wrap, node::FSEventWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(spawn_sync, node::SyncProcessRunner::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(js_stream, node::JSStream::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(pipe_wrap, node::PipeWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tty_wrap, node::TTYWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(crypto, node::crypto::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tls_wrap, node::TLSWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(config, node::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(zlib, node::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(fs, node::fs::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(icu, node::i18n::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(cares_wrap, node::cares_wrap::Initialize)
There is work in progress to make the above internal.
If a module uses NODE_MODULE_CONTEXT_AWARE_INTERNAL
it will use `NM_F_INTERNAL which is used by:
NODE_MODULE_CONTEXT_AWARE_INTERNAL(heap_utils, node::heap::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(types, node::InitializeTypes)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(timers, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(http2, node::http2::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(string_decoder, node::InitializeStringDecoder)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(http_parser, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(performance, node::performance::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(uv, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(messaging, node::worker::InitMessaging)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(trace_events, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(serdes, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(v8, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(stream_pipe, node::InitializeStreamPipe)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(domain, node::domain::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(module_wrap, node::loader::ModuleWrap::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(worker, node::worker::InitWorker)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(symbols, node::symbols::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(signal_wrap, node::SignalWrap::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(stream_wrap, node::LibuvStreamWrap::Initialize)
So how about NM_F_LINKED
?
// - process._linkedBinding(): intended to be used by embedders to add
// additional C++ bindings in their applications. These C++ bindings
// can be created using NODE_MODULE_CONTEXT_AWARE_CPP() with the flag
GetBinding
Will be bound to process.binding
. This function will try to find a builtin module with the passed in module name.
GetLinkedBinding
Will be bound to process._linkedBinding.
GetInternalBinding
Will be bound to process.internalBinding.
lib/internal/bootstrap_node.js
This is the file that is loaded by LoadEnvironment
as "bootstrap_node.js".
I read that this file is actually precompiled, where/how?
This file is referenced in node.gyp and is used with the target node_js2c
. This target calls tools/js2c.py
which is a tool for converting
JavaScript source code into C-Style char arrays. This target will process all the library_files specified in the variables section which
lib/internal/bootstrap_node.js
is one of. The output of this out/Debug/obj/gen/node_javascript.h
, depending on the type of build being performed.
So lib/internal/bootstrap_node.js
will become internal_bootstrap_node_value
in node_javascript.h
.
This is then later included in src/node_javascript.cc
.
We can see the contents of this in lldb using:
(lldb) p internal_bootstrap_node_value
Loading of Node Native JavaScript
The JavaScript source files that are located in the lib directory are not loaded in the normal way the JavaScript sources you provide yourself.
Instead these are converted first into c arrays for faster execution. This is done by a build and more specifically a Python tool called js2c.py
(JavaScript to C).
If we take a look at node_javascript.h we find that it declares two functions:
void DefineJavaScript(Environment* env, v8::Local<v8::Object> target);
v8::Local<v8::String> MainSource(Environment* env);
Main source is what is used to load bootstrap_node.js
and DefineJavaScript
is used in the Binding
function in src/node.js for loading
the natives
modules using the binding
call. For example, in lib/internal/bootstrap_node.js
:
NativeModule._source = process.binding('natives');
Lets set a breakpoint in DefineJavaScript
:
(lldb) br s -n node::DefineJavaScript
CHECK(target->Set(env->context(),
internal_bootstrap_node_key.ToStringChecked(env->isolate()),
internal_bootstrap_node_value.ToStringChecked(env->isolate())).FromJust());
(lldb) jlh internal_bootstrap_node_key.ToStringChecked(env->isolate())
"internal/bootstrap_node"
And the value of this will the contents of boostrap_node.js
. So this will be set as the property on the export object
(the object named target above).
Loading of builtins
I wanted to know how builtins, like tcp_wrap and others are loaded. The loading is done explicitely upon initialization by this call in node.cc:
void Init(int* argc,
const char** argv,
int* exec_argc,
const char*** exec_argv) {
// Initialize prog_start_time to get relative uptime.
prog_start_time = static_cast<double>(uv_now(uv_default_loop()));
// Register built-in modules
RegisterBuiltinModules();
RegisterBuiltinModules
void RegisterBuiltinModules() {
#define V(modname) _register_##modname();
NODE_BUILTIN_MODULES(V)
#undef V
}
The macro NODE_BUILTIN_MODULES
can be found in node_internals.h and list all the modules:
#define NODE_BUILTIN_OPENSSL_MODULES(V) V(crypto) V(tls_wrap)
#define NODE_BUILTIN_ICU_MODULES(V) V(icu)
#define NODE_BUILTIN_STANDARD_MODULES(V) \
V(async_wrap) \
V(buffer) \
V(cares_wrap) \
V(config) \
V(contextify) \
V(domain) \
V(fs) \
V(fs_event_wrap) \
V(http2) \
V(http_parser) \
V(inspector) \
V(js_stream) \
V(module_wrap) \
V(os) \
V(performance) \
V(pipe_wrap) \
V(process_wrap) \
V(serdes) \
V(signal_wrap) \
V(spawn_sync) \
V(stream_wrap) \
V(string_decoder) \
V(tcp_wrap) \
V(timer_wrap) \
V(trace_events) \
V(tty_wrap) \
V(udp_wrap) \
V(url) \
V(util) \
V(uv) \
V(v8) \
V(zlib)
#define NODE_BUILTIN_MODULES(V) \
NODE_BUILTIN_STANDARD_MODULES(V) \
NODE_BUILTIN_OPENSSL_MODULES(V) \
NODE_BUILTIN_ICU_MODULES(V)
So for example, tcp_wrap
would have the following function call generated by the preprocesor:
void RegisterBuiltinModules() {
_register_tcp_wrap();
This will call the _register_tcp_wrap()
function that is generated by the NODE_BUILTIN_MODULE_CONTEXT_AWARE
in tcp_wrap.cc.
Lets take a look at the following line from src/tcp_wrap.cc:
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tcp_wrap, node::TCPWrap::Initialize)
Now, setting a breakpoint on this and printing the thread backtrace gives:
-> 436 NODE_BUILTIN_MODULE_CONTEXT_AWARE(tcp_wrap, node::TCPWrap::Initialize)
(lldb) bt
* thread #1: tid = 0x18d8053, 0x0000000100d1056b node`_register_tcp_wrap() + 11 at tcp_wrap.cc:436, queue = 'com.apple.main-thread', stop reason = breakpoint 5.1
* frame #0: 0x0000000100d1056b node`_register_tcp_wrap() + 11 at tcp_wrap.cc:436
frame #1: 0x00007fff5fc1310b dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 265
frame #2: 0x00007fff5fc13284 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
frame #3: 0x00007fff5fc0f8bd dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 305
frame #4: 0x00007fff5fc0f743 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 127
frame #5: 0x00007fff5fc0f9b3 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 75
frame #6: 0x00007fff5fc020f1 dyld`dyld::initializeMainExecutable() + 208
frame #7: 0x00007fff5fc05d98 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 3596
frame #8: 0x00007fff5fc01276 dyld`dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) + 512
frame #9: 0x00007fff5fc01036 dyld`_dyld_start + 54
First things to note is that NODE_BUILTIN_MODULE_CONTEXT_AWARE
is a macro defined in node.h which takes a modname
and regfunc
argument. This in turn calls another macro function named NODE_MODULE_CONTEXT_AWARE_X
.
This macro is invoked with the following arguments:
NODE_MODULE_CONTEXT_AWARE_X(modname, regfunc, NULL, NM_F_BUILTIN)
We already know that in our case modname is tcp_wrap
and that regfunc
is node::TCPWrap::Initialize.
NODE_MODULE_CONTEXT_AWARE_X
#define NODE_MODULE_CONTEXT_AWARE_CPP(modname, regfunc, priv, flags) \
extern "C" { \
static node::node_module _module = \
{ \
NODE_MODULE_VERSION, \
flags, \
nullptr, \
__FILE__, \
nullptr, \
(node::addon_context_register_func) (regfunc), \
NODE_STRINGIFY(modname), \
priv, \
nullptr \
}; \
void _register_ ## modname() { \
node_module_register(&_module) \
}
}
First, extern "C" means that C linkage should be used and no C++ name mangling should occur. This is saying that everything in the block should have this kind of linkage. With that out of the way we can focus on the contents of the block.
static node::node_module _module = \
{ \
NODE_MODULE_VERSION, \
flags, \
NULL, \
__FILE__, \
NULL, \
(node::addon_context_register_func) (regfunc), \
NODE_STRINGIFY(modname), \
priv, \
NULL \
}; \
We are creating a static variable (it exists for the lifetime of the program, but the name is not visible outside of the block. Remember that we are in tcp_wrap.cc in this walk through so the preprocessor will add a definition of the _module
to that tcp_wrap.
node_module
is a struct in node.h and looks like this:
struct node_module {
int nm_version;
unsigned int nm_flags;
void* nm_dso_handle;
const char* nm_filename;
node::addon_register_func nm_register_func;
node::addon_context_register_func nm_context_register_func;
const char* nm_modname;
void* nm_priv;
struct node_module* nm_link;
};
Environment
To create an Environment we need to have an v8::Isolate instance and also an IsolateData instance:
inline Environment(IsolateData* isolate_data, v8::Local<v8::Context> context);
Such a call can be found during startup:
(lldb) bt
* thread #1: tid = 0x946796, 0x00000001008fa39f node`node::Start(int, char**) + 307 at node.cc:4390, queue = 'com.apple.main-thread', stop reason = step over
* frame #0: 0x00000001008fa39f node`node::Start(int, char**) + 307 at node.cc:4390
frame #1: 0x00000001008fa26c node`node::Start(argc=<unavailable>, argv=0x0000000102a00000) + 205 at node.cc:4503
frame #2: 0x0000000100000b34 node`start + 52
See IsolateData for details about that class and the members that are proxied through via an Environment instance.
An Environment has a number of nested classes:
AsyncHooks
AsyncHooksCallbackScope
DomainFlag
TickInfo
The above nested classes call the DISALLOW_COPY_AND_ASSIGN
macro, for example:
DISALLOW_COPY_AND_ASSIGN(TickInfo);
This macro uses = delete
for the copy and assignment operator functions:
#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
TypeName(const TypeName&) = delete; \
void operator=(const TypeName&) = delete
The last nested class is:
HandleCleanup
Environment also has a number of static methods:
static inline Environment* GetCurrent(v8::Isolate* isolate);
This got me wondering, how can we get an Environment from an Isolate, an Isolate is a V8 thing and an Environment a Node thing?
inline Environment* Environment::GetCurrent(v8::Isolate* isolate) {
return GetCurrent(isolate->GetCurrentContext());
}
So we are going to use the current context to get the Environment pointer, but the context is also a V8 concept, not a node.js concept.
inline Environment* Environment::GetCurrent(v8::Local<v8::Context> context) {
return static_cast<Environment*>(context->GetAlignedPointerFromEmbedderData(kContextEmbedderDataIndex));
}
Alright, now we are getting somewhere. Lets take a closer look at context->GetAlignedPointerFromEmbedderData(kContextEmbedderDataIndex)
.
We have to look at the Environment constructor to see where this is set (env-inl.h):
inline Environment::Environment(IsolateData* isolate_data, v8::Local<v8::Context> context)
...
AssignToContext(context);
So, we can see that AssignToContext
is setting the environment on the passed-in context:
static const int kContextEmbedderDataIndex = 5;
inline void Environment::AssignToContext(v8::Local<v8::Context> context) {
context->SetAlignedPointerInEmbedderData(kContextEmbedderDataIndex, this);
}
So this how the Environment is associated with the context, and this enables us to get the environment for a context above.
The argument to SetAlignedPointerInEmbedderData
is a void pointer so it can be anything you want.
The data is stored in a V8 FixedArray, the kContextEmbedderDataIndex
is the index into this array (I think, still learning here).
TODO: read up on how this FixedArray and alignment works.
There are also static methods to get the Environment using a context.
So an Isolate is like single instance of V8 runtime. A Context is a separate execution context that does not know about other context.
An Environment is a Node.js concept and multiple environments can exist within a single isolate. What I'm trying to figure out is how a AtExit callback can be registered with an environment, and also how to force that callback to be called when that particular environment is about to exit. Currently, this is done with a thread-local, but if there are multiple environments per thread these will overwrite each other.
ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES
These are declared in env.h:
#define ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V) \
V(as_external, v8::External) \
V(async_hooks_destroy_function, v8::Function) \
V(async_hooks_init_function, v8::Function) \
V(async_hooks_post_function, v8::Function) \
V(async_hooks_pre_function, v8::Function) \
V(binding_cache_object, v8::Object) \
V(buffer_constructor_function, v8::Function) \
V(buffer_prototype_object, v8::Object) \
V(context, v8::Context) \
V(domain_array, v8::Array) \
V(domains_stack_array, v8::Array) \
V(fs_stats_constructor_function, v8::Function) \
V(generic_internal_field_template, v8::ObjectTemplate) \
V(jsstream_constructor_template, v8::FunctionTemplate) \
V(module_load_list_array, v8::Array) \
V(pipe_constructor_template, v8::FunctionTemplate) \
V(process_object, v8::Object) \
V(promise_reject_function, v8::Function) \
V(push_values_to_array_function, v8::Function) \
V(script_context_constructor_template, v8::FunctionTemplate) \
V(script_data_constructor_function, v8::Function) \
V(secure_context_constructor_template, v8::FunctionTemplate) \
V(tcp_constructor_template, v8::FunctionTemplate) \
V(tick_callback_function, v8::Function) \
V(tls_wrap_constructor_function, v8::Function) \
V(tls_wrap_constructor_template, v8::FunctionTemplate) \
V(tty_constructor_template, v8::FunctionTemplate) \
V(udp_constructor_function, v8::Function) \
V(write_wrap_constructor_function, v8::Function) \
Notice that V
is passed in enabling different macros to be passed in. This is used
to create setters/getters like this:
#define V(PropertyName, TypeName) \
inline v8::Local<TypeName> PropertyName() const; \
inline void set_ ## PropertyName(v8::Local<TypeName> value);
ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)
#undef V
The field itself is private and defined in env.h:
#define V(PropertyName, TypeName) \
v8::Persistent<TypeName> PropertyName ## _;
ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)
#undef V
The above is defining getters and setter for all the properties in ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES
. Notice the usage of V and that is is passed into the macro.
Lets take a look at one:
V(tcp_constructor_template, v8::FunctionTemplate)
Like before these are only the declarations, the definitions can be found in src/env-inl.h
:
#define V(PropertyName, TypeName) \
inline v8::Local<TypeName> Environment::PropertyName() const { \
return StrongPersistentToLocal(PropertyName ## _); \
} \
inline void Environment::set_ ## PropertyName(v8::Local<TypeName> value) { \
PropertyName ## _.Reset(isolate(), value); \
}
ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)
#undef V
So, in the case of tcp_constructor_template
this would expand to:
inline v8::Local<v8::FunctionTemplate> Environment::tcp_constructor_template() const {
return StrongPersistentToLocal(tcp_constructor_template_);
}
inline void Environment::set_tcp_constructor_template(v8::Local<v8::FunctionTempalate> value) {
tcp_constructor_template_.Reset(isolate(), value);
}
So where is this setter called?
It is called from TCPWrap::Initialize
:
env->set_tcp_constructor_template(t);
And when is TCPWrap::Initialize
called?
From Binding in node.cc
:
...
mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);
Recall (from Loading of builtins
) how a module is registred:
NODE_MODULE_CONTEXT_AWARE_BUILTIN(tcp_wrap, node::TCPWrap::Initialize)
The nm_context_register_func
is node::TCPWrap::Initialize
, which is a static method declared in src/tcp_wrap.h:
static void Initialize(v8::Local<v8::Object> target,
v8::Local<v8::Value> unused,
v8::Local<v8::Context> context);
wrap_data->MakeCallback(env->onconnection_string(), arraysize(argv), argv);
env->onconnection_string()
is a simple getter generated by the preprocessor by a macro in env-inl.h
.
TCPWrap::Initialize
First thing that happens is that the Environment is retreived using the current context.
Next, a function template is created:
Local<FunctionTemplate> t = env->NewFunctionTemplate(New);
Just to be clear New
is the address of the function and we are just passing that to the NewFunctionTemplate
method. It will use that address when creating a NewFunctionTemplate.
TcpWrap::New
This class is called TcpWrap because is wraps a libuv uv_tcp_t handle.
static void SetNoDelay(const v8::FunctionCallbackInfo<v8::Value>& args);
static void SetKeepAlive(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Bind(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Bind6(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Listen(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Connect(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Connect6(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Open(const v8::FunctionCallbackInfo<v8::Value>& args);
Each of the functions above, for example SetNoDelay
will all wrap a function call in libuv:
void TCPWrap::SetNoDelay(const FunctionCallbackInfo<Value>& args) {
TCPWrap* wrap;
ASSIGN_OR_RETURN_UNWRAP(&wrap,
args.Holder(),
args.GetReturnValue().Set(UV_EBADF));
int enable = static_cast<int>(args[0]->BooleanValue());
int err = uv_tcp_nodelay(&wrap->handle_, enable);
args.GetReturnValue().Set(err);
}
When a new instance of this class is created it will initialize the handle which must be of type uv_tcp_t:
int r = uv_tcp_init(env->event_loop(), &handle_);
Now, a uv_tcp_t could be used for accepting connection but also for connecting to sockets.
When this is used in JavaScript is would look like this:
var TCP = process.binding('tcp_wrap').TCP;
var handle = new TCP();
When the second line is executed the callback New
will be invoked. This is set up by this line later in TCPWrap::Initialize:
target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "TCP"), t->GetFunction());
New takes a single argument of type v8::FunctionCallbackInfo which holds information about the function call make.
These are things like the number of arguments used, the arguments can be retreived using with the operator[]. New
looks like this:
void TCPWrap::New(const FunctionCallbackInfo<Value>& a) {
CHECK(a.IsConstructCall());
Environment* env = Environment::GetCurrent(a);
TCPWrap* wrap;
if (a.Length() == 0) {
wrap = new TCPWrap(env, a.This(), nullptr);
} else if (a[0]->IsExternal()) {
void* ptr = a[0].As<External>()->Value();
wrap = new TCPWrap(env, a.This(), static_cast<AsyncWrap*>(ptr));
} else {
UNREACHABLE();
}
CHECK(wrap);
}
Like mentioned above when the constructor of TCPWrap is called it will initialize the uv_tcp_t handle.
Using the example above we can see that Length
should be 0 as we did not pass any arguments to the TCP function. Just wondering, what could be passed as a parameter?
What ever it might look like it should be a pointer to an AsyncWrap.
So this is where the instance of TCPWrap is created. Notice a.This()
which is passed all the way up to BaseObject's constructor and made into a persistent handle.
const req = new TCPConnectWrap();
const err = client.connect(req, '127.0.0.1', this.address().port);
Now, new TcpConnectWrap() is setup in TCPWrap::Initalize and the only thing that happens here is that it configured with a constructor that checks that this function is called with the new
keyword. So there is really nothing else happening at this stage. But, when we call client.connect
something interesting does happen:
TCPWrap::Connect
if (err == 0) {
ConnectWrap* req_wrap = new ConnectWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_TCPCONNECTWRAP);
err = uv_tcp_connect(req_wrap->req(),
&wrap->handle_,
reinterpret_cast<const sockaddr*>(&addr),
AfterConnect);
req_wrap->Dispatched();
if (err)
delete req_wrap;
}
So we can see that we are creating a new ConnectWrap instance which extends AsyncWrap and also ReqWrap. Thinking about this makes sense I think. If we recall that the classes with Wrap in them wrap libuv concepts, and in this case we are going to make a tcp connection. If we look at our client example we can see that we are using uv_connect_t make the connection (named connection_req
):
r = uv_tcp_connect(&connect_req,
&tcp_client,
(const struct sockaddr*) &addr,
connect_cb);
tcp_client
in the above example is of type uv_tcp_t
.
But ConnectWrap also extend AsyncWrap. See the AsyncWrap section for details.
What might be of interest and something to look into a little deeper is that ReqWrap will add the request wrap (wrapping a uv_req_t remember) to the current env req_wrap_queue. Keep in mind that a reqest is shortlived.
The last thing that the ConnectWrap constructor does is call Wrap:
Wrap(req_wrap_obj, this);
Now, you might not remember what this req_wrap_obj
is, but it was the first argument to client.connect
and was the new TCPConnectWrap
instance. But this was nothing more than a constructor and nothing else:
(lldb) p req_wrap_obj
(v8::Local<v8::Object>) $34 = (val_ = 0x00007fff5fbfd018)
(lldb) p *(*req_wrap_obj)
(v8::Object) $35 = {}
We can see that this is a v8::Localv8::Object and we are going to store the ConnectWrap instance in this object:
req_wrap_obj->SetAlignedPointerInInternalField(0, this);
So why is this being done?
Well if you take a look in AfterConnect you can see that this will be accessed and passed as a parameter to the oncomplete function:
ConnectWrap* req_wrap = static_cast<ConnectWrap*>(req->data);
...
Local<Value> argv[5] = {
Integer::New(env->isolate(), status),
wrap->object(),
req_wrap->object(),
Boolean::New(env->isolate(), readable),
Boolean::New(env->isolate(), writable)
};
req_wrap->MakeCallback(env->oncomplete_string(), arraysize(argv), argv);
This will then invoke the oncomplete
callback set up on the req
object:
req.oncomplete = function(status, client_, req_, readable, writable) {
}
NewFunctionTemplate
NewFunctionTemplate in env.h specifies a default value for the second parameter `v8::Localv8::Signature() so it does not have to be specified.
v8::Local<v8::External> external = as_external();
return v8::FunctionTemplate::New(isolate(), callback, external, signature);
(lldb) p callback
(v8::FunctionCallback) $0 = 0x0000000100db8540 (node`node::TCPWrap::New(v8::FunctionCallbackInfo<v8::Value> const&) at tcp_wrap.cc:107)
So t
is a function template, a blueprint for a single function. You create an instance of the template by calling GetFunction
. Recall that in JavaScript to create a new type of object you use a function. When this function is used as a constructor, using new, the returned object will be an instance of the InstanceTemplate (ObjectTemplate) that will be discussed shortly.
t->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "TCP"));
The class name is is used for printing objects created with the function created from the FunctionTemplate as its constructor.
t->InstanceTemplate()->SetInternalFieldCount(1);
InstanceTemplate
returns the ObjectTemplate associated with the FunctionTemplate. Every FunctionTemplate has one. Like mentioned before this is the object that is returned after having used the FunctionTemplate as a constructor.
SetInternalFieldCount(1)
instructs V8 to allocate internal storage for every instance created using this template. Anything can be stored that space allocated, and
for node this is often done using SetAlignedPointerInInternalField
. This could then be retrieved using GetAlignedPointerFromInternalField
.
Next, the ObjectTemplate is set up. First a number of properties are configured:
t->InstanceTemplate()->Set(String::NewFromUtf8(env->isolate(), "reading"),
Boolean::New(env->isolate(), false));
Then, a number of prototype methods are set:
env->SetProtoMethod(t, "close", HandleWrap::Close);
Alright, lets take a look at this SetProtoMethod
method in Environment:
inline void Environment::SetProtoMethod(v8::Local<v8::FunctionTemplate> that,
const char* name,
v8::FunctionCallback callback) {
v8::Local<v8::Signature> signature = v8::Signature::New(isolate(), that);
v8::Local<v8::FunctionTemplate> t = NewFunctionTemplate(callback, signature);
// kInternalized strings are created in the old space.
const v8::NewStringType type = v8::NewStringType::kInternalized;
v8::Local<v8::String> name_string =
v8::String::NewFromUtf8(isolate(), name, type).ToLocalChecked();
that->PrototypeTemplate()->Set(name_string, t);
t->SetClassName(name_string); // NODE_SET_PROTOTYPE_METHOD() compatibility.
}
A Signature
has the following class documentation: "A Signature specifies which receiver is valid for a function.". So the receiver is set to be that
which is t
, our newly created FunctionTemplate.
Next, we are creating a FunctionTemplate for the call back HandleWrap::Close
with the signature just created.
Then, we will set the function template as a PrototypeTemplate. Again we see t->SetClassName
which I believe is for when this is printed.
There are few more prototype methods that use HandleWrap callbacks:
env->SetProtoMethod(t, "ref", HandleWrap::Ref);
env->SetProtoMethod(t, "unref", HandleWrap::Unref);
env->SetProtoMethod(t, "hasRef", HandleWrap::HasRef);
So have have a class called HandleWrap
, which I think requires a section of its own.
After this we find the following line:
StreamWrap::AddMethods(env, t, StreamBase::kFlagHasWritev);
This method is defined in stream_wrap.cc:
env->SetProtoMethod(target, "setBlocking", SetBlocking);
StreamBase::AddMethods<StreamWrap>(env, target, flags);
I've been wondering about the class names that end with Wrap and what they are wrapping. My thinking now is that they are wrapping libuv things. For instance, take StreamWrap, in libuv src/unix/stream.c which is what SetBlocking calls:
void StreamWrap::SetBlocking(const FunctionCallbackInfo<Value>& args) {
StreamWrap* wrap;
ASSIGN_OR_RETURN_UNWRAP(&wrap, args.Holder());
CHECK_GT(args.Length(), 0);
if (!wrap->IsAlive())
return args.GetReturnValue().Set(UV_EINVAL);
bool enable = args[0]->IsTrue();
args.GetReturnValue().Set(uv_stream_set_blocking(wrap->stream(), enable));
}
Lets take a look at ASSIGN_OR_RETURN_UNWRAP
:
#define ASSIGN_OR_RETURN_UNWRAP(ptr, obj, ...) \
do { \
*ptr = \
Unwrap<typename node::remove_reference<decltype(**ptr)>::type>(obj); \
if (*ptr == nullptr) \
return __VA_ARGS__; \
} while (0)
So what would this look like after the preprocessor has processed it (need to double check this):
do {
*wrap = Unwrap<uv_stream_t>(obj);
if (*wrap == nullptr)
return;
} while (0);
What does __VA_ARGS__
do?
I've seen this before with variadic methods in c, but not sure what it means to return it. Turns out that if you don't pass anything apart from the required arguments then the return __VA_ARGS_ statement will just be
return;`. There are other places when the usage of this macro does pass additional arguments, for example:
ASSIGN_OR_RETURN_UNWRAP(&wrap,
args.Holder(),
args.GetReturnValue().Set(UV_EBADF));
do {
*wrap = Unwrap<uv_stream_t>(obj);
if (*wrap == nullptr)
return args.GetReturnValue.Set(UV_EBADF);
} while (0);
So we will be returning early with a BADF (bad file descriptor) error.
BaseObject
inline BaseObject(Environment* env, v8::Local<v8::Object> handle);
I'm thinking that the handle is the Node representation of a libuv handle.
inline v8::Persistent<v8::Object>& persistent();
A persistent handle lives on the heap just like a local handle but it does not correspond to C++ scopes. You have to explicitly call Persistent::Reset.
ReqWrap
class ReqWrap : public AsyncWrap
cares_wrap.cc has a subclass named GetAddrInfoReqWrap
node_file.cc has a subclass named FSReqWrap
stream_base.cc has a subclass name ShutDownWrap
stream_base.cc has a subclass name WriteWrap
udp_wrap.cc has a subclass named SendWrap
connect_wrap.cc has a subclass named 'ConnectWrap` which is subclassed by PipeWrap and TCPWrap.
AsyncWrap
Some background about AsyncWrap can be found here So using AsyncWrap we can have callbacks invoked during the life of handle objects. A handle object would for example be a TCPWrap which extends ConnectionWrap -> StreamWrap -> HandleWrap.
Being a builtin module it follows the same initialization as others. So lets take a look at the initialization function and see what kind of functions are made available from JavaScript:
env->SetMethod(target, "setupHooks", SetupHooks); env->SetMethod(target, "pushAsyncIds", PushAsyncIds); env->SetMethod(target, "popAsyncIds", PopAsyncIds); env->SetMethod(target, "queueDestroyAsyncId", QueueDestroyAsyncId); env->SetMethod(target, "enablePromiseHook", EnablePromiseHook); env->SetMethod(target, "disablePromiseHook", DisablePromiseHook); env->SetMethod(target, "registerDestroyHook", RegisterDestroyHook);
You can confirm this by using:
$ ./node --expose-internals -p "require('internal/test/binding').internalBinding('async_wrap')"
env->set_async_hooks_init_function(init_v.As<Function>());
So, if you are like me you might have gone searching for this set_async_hooks_init_function
and not finding it. You
might recall this coming up before. So every environment will have such setters and getters for
V(async_hooks_destroy_function, v8::Function)
V(async_hooks_init_function, v8::Function)
V(async_hooks_post_function, v8::Function)
V(async_hooks_pre_function, v8::Function)
So, we are setting a field named async_hooks_init_function_ in the current env. An example of this usage might be:
const asyncWrap = process.binding('async_wrap');
let asyncObject = {
init: function(uid, provider, parentUid, parentHandle) {
process._rawDebug('init uid:', uid, ', provider:', provider);
},
pre: function(uid) {
process._rawDebug('pre uid:', uid);
},
post: function(uid, didThrow) {
process._rawDebug('post. uid:', uid, 'didThrow:', didThrow);
},
destroy: function(uid) {
process._rawDebug('destroy: uid:', uid);
}
};
There is also a module named async_hoooks
in lib/async_hook.js that can be used:
var ah = require('async_hooks');
let asyncObject = {
init: function(uid, provider, parentUid, parentHandle) {
process._rawDebug('init uid:', uid, ', provider:', provider);
},
before: function(uid) {
process._rawDebug('pre uid:', uid);
},
after: function(uid, didThrow) {
process._rawDebug('post. uid:', uid, 'didThrow:', didThrow);
},
destroy: function(uid) {
process._rawDebug('destroy: uid:', uid);
}
};
let asynchook = ah.createHook(asyncObject);
Now, we can create a break point and see when SetupHooks
is called.
(lldb) br s -n node::SetupHooks
When the startup function in bootstrap_node.js is executed it will call the function setupProcessFatal
.
function setupProcessFatal() {
const {
executionAsyncId,
clearDefaultTriggerAsyncId,
clearAsyncIdStack,
hasAsyncIdStack,
afterHooksExist,
emitAfter
} = NativeModule.require('internal/async_hooks');
...
This will load internal/async_hooks
module which will call:
const async_wrap = process.binding('async_wrap');
...
async_wrap.setupHooks({ init: emitInitNative,
before: emitBeforeNative,
after: emitAfterNative,
destroy: emitDestroyNative,
promise_resolve: emitPromiseResolveNative });
So this is the first time that setupHooks is called.
In SetupHooks
we have the following macro:
#define SET_HOOK_FN(name) \
Local<Value> name##_v = fn_obj->Get( \
env->context(), \
FIXED_ONE_BYTE_STRING(env->isolate(), #name)).ToLocalChecked(); \
CHECK(name##_v->IsFunction()); \
env->set_async_hooks_##name##_function(name##_v.As<Function>());
SET_HOOK_FN(init);
SET_HOOK_FN(before);
SET_HOOK_FN(after);
SET_HOOK_FN(destroy);
SET_HOOK_FN(promise_resolve);
#undef SET_HOOK_FN
Lets expand it for the init
function:
Local<Value> init_v = fn_obj->Get(env->context(), FIXED_ONE_BYTE_STRING(env->isolate(), "init")).ToLocalChecked();
CHECK(init_v->IsFunction());
env->set_async_hooks_init_function(init_v.As<Function>());
After these function have been set we also have the following code in SetupHooks
:
Local<FunctionTemplate> ctor = FunctionTemplate::New(env->isolate());
ctor->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "PromiseWrap"));
Local<ObjectTemplate> promise_wrap_template = ctor->InstanceTemplate();
promise_wrap_template->SetInternalFieldCount(PromiseWrap::kInternalFieldCount); // kInternalFieldCount = 3
promise_wrap_template->SetAccessor(FIXED_ONE_BYTE_STRING(env->isolate(), "promise"), PromiseWrap::GetPromise);
promise_wrap_template->SetAccessor(FIXED_ONE_BYTE_STRING(env->isolate(), "isChainedPromise"), PromiseWrap::getIsChainedPromise);
env->set_promise_wrap_template(promise_wrap_template);
PromiseWrap
is a class defined in async_wrap.cc. So we are setting up a constructor template for a PromiseWrap on the environment.
(lldb) expr env->async_hooks_init_function()
(v8::Local<v8::Function>) $66 = (val_ = 0x00000001060013c0)
(lldb) jlh env->async_hooks_init_function()
0x329732782401: [Function]
- map = 0x329757882521 [FastProperties]
- prototype = 0x3297917043d1
- elements = 0x3297cd382251 <FixedArray[0]> [HOLEY_ELEMENTS]
- function prototype =
- initial_map =
- shared_info = 0x32979b184ce1 <SharedFunctionInfo emitInitNative>
- name = 0x32979b1840f9 <String[14]: emitInitNative>
- formal_parameter_count = 4
- kind = [ NormalFunction ]
- context = 0x329732782241 <FixedArray[38]>
- code = 0x19146b71b241 <Code BUILTIN>
- source code = (asyncId, type, triggerAsyncId, resource) {
active_hooks.call_depth += 1;
// Use a single try/catch for all hook to avoid setting up one per iteration.
try {
for (var i = 0; i < active_hooks.array.length; i++) {
if (typeof active_hooks.array[i][init_symbol] === 'function') {
active_hooks.array[i][init_symbol](
asyncId, type, triggerAsyncId,
resource
);
}
}
} catch (e) {
fatalError(e);
} finally {
active_hooks.call_depth -= 1;
}
// Hooks can only be restored if there have been no recursive hook calls.
// Also the active hooks do not need to be restored if enable()/disable()
// weren't called during hook execution, in which case active_hooks.tmp_array
// will be null.
if (active_hooks.call_depth === 0 && active_hooks.tmp_array !== null) {
restoreActiveHooks();
}
}
- properties = 0x3297cd382251 <FixedArray[0]> {
#length: 0x3297cd3b8c81 <AccessorInfo> (const accessor descriptor)
#name: 0x3297cd3b8c11 <AccessorInfo> (const accessor descriptor)
#prototype: 0x3297cd3b8cf1 <AccessorInfo> (const accessor descriptor)
}
- feedback vector: not available
So lets take a closer look at emitInitNative
in 'lib/internal/async_hooks.js':
active_hooks.call_depth += 1;
try {
for (var i = 0; i < active_hooks.array.length; i++) {
if (typeof active_hooks.array[i][init_symbol] === 'function') {
active_hooks.array[i][init_symbol]( asyncId, type, triggerAsyncId, resource);
}
}
} catch (e) {
fatalError(e);
} finally {
active_hooks.call_depth -= 1;
}
When will env->async_hooks_init_function()
be called?
Each Environment has an AsyncHook as a member (src/env.h):
AsyncHooks async_hooks_;
AsyncHook is a nested class of Environment.
Lets take a look at tcp_wrap.h. TCPWrap extends ConnectionWrap, which extends LibuvStreamWrap, which extends HandleWrap, which extends AsyncWrap.
listen
will eventually call:
handle = new TCP(TCPConstants.SERVER);
This will call void TCPWrap::New
:
...
ProviderType provider;
switch (type) {
case SOCKET:
provider = PROVIDER_TCPWRAP;
break;
case SERVER:
provider = PROVIDER_TCPSERVERWRAP;
break;
default:
UNREACHABLE();
}
new TCPWrap(env, args.This(), provider);
This constructor call will delegate up to the BaseObject class's constructor and then continue with AsyncWrap's constructor:
#define NODE_ASYNC_ID_OFFSET 0xA1C
...
// Shift provider value over to prevent id collision.
persistent().SetWrapperClassId(NODE_ASYNC_ID_OFFSET + provider);
Lets take a look SetWrapperClassId more closely. e following will use
NODE_ASYNC_ID_OFFSET
is 2588 in decimal.
(lldb) expr provider_type_
(const node::AsyncWrap::ProviderType) $3 = PROVIDER_PROMISE
(lldb) expr 2588 + provider_type_
(int) $4 = 2607
persistent()
is a member function of BaseObject which returns the handle for this AsyncWrap instance.
SetWrapperClassId
is a member function in PersistentBase<T>
:
internal::Object** obj = reinterpret_cast<internal::Object**>(this->val_);
uint8_t* addr = reinterpret_cast<uint8_t*>(obj) + I::kNodeClassIdOffset
obj
is a pointer to a pointer which I find hard to visualize but here is an attempt:
(lldb) expr this->val_
(v8::Object *) $76 = 0x000000010604b680
(lldb) expr obj
(v8::internal::Object **) $75 = 0x000000010604b680
lldb) expr &this->val_
(v8::Object **) $84 = 0x0000000104507a28
(lldb) expr &obj
(v8::internal::Object ***) $85 = 0x00007fff5fbfc898
So this->val_
is a pointer to v8::Object and we are using reinterpret_cast to instruct the compiler to treat it
like it was of type v8::internal::Object**.
&this->val
0x0000000104507a28 -----------------------> 0x000000010604b680 -------------> v8::Object
|
|
&obj |
0x00007fff5fbfc898 -------------------------------+
Next, we are going to interpret the obj as an uint8_t* type so that we can perform the addition:
uint8_t* addr = reinterpret_cast<uint8_t*>(obj) + I::kNodeClassIdOffset
*reinterpret_cast<uint16_t*>(addr) = class_id;
And then we set the value that addr is pointing to, to the passed in class it. But that does not really explain why this is being done. Lets stick a break point in the PromiseWrap constructor:
(lldb) br s -f async_wrap.cc -l 611
Now, lets try casting obj
to
(lldb) expr *reinterpret_cast<v8::internal::GlobalHandles::Node*>(obj)
(v8::internal::GlobalHandles::Node) $96 = {
object_ = 0x000033d8d5325b49
class_id_ = 0
index_ = 'D'
flags_ = 'a'
weak_callback_ = 0x0000000000000000
parameter_or_next_free_ = {
parameter = 0x0000000000000000
next_free = 0x0000000000000000
}
}
v8::internal::GlobalHandles::Node
can be found in deps/v8/src/global-handles.cc
and we can see that it has
the following private members:
Object* object_;
// Wrapper class ID.
uint16_t class_id_;
// Index in the containing handle block.
uint8_t index_;
Notice the order. This is what addr is pointing to in :
uint8_t* addr = reinterpret_cast<uint8_t*>(obj) + I::kNodeClassIdOffset
*reinterpret_cast<uint16_t*>(addr) = class_id;
After having set the value we can again inspect the value:
(lldb) expr *reinterpret_cast<v8::internal::GlobalHandles::Node*>(obj)
(v8::internal::GlobalHandles::Node) $107 = {
object_ = 0x000033d8d5325b49
class_id_ = 2607
index_ = 'D'
flags_ = 'a'
weak_callback_ = 0x0000000000000000
parameter_or_next_free_ = {
parameter = 0x0000000000000000
next_free = 0x0000000000000000
}
}
TODO: Figure out how the connection between the Node and the stored object actually works.
So lets take a step back and see how things all fits together.
(lldb) br s -f async_wrap.cc -l 264
(lldb) br s -f v8.h -l 9082
When we create a new PromiseWrap it will delegate calling the above constructors, the last on is the BaseObject constructor it will invoke .
template <class T>
T* PersistentBase<T>::New(Isolate* isolate, T* that) {
if (that == NULL) return NULL;
internal::Object** p = reinterpret_cast<internal::Object**>(that);
return reinterpret_cast<T*>(V8::GlobalizeReference(reinterpret_cast<internal::Isolate*>(isolate), p));
}
GlobalizeReference
will then do the following:
i::Object** V8::GlobalizeReference(i::Isolate* isolate, i::Object** obj) {
LOG_API(isolate, Persistent, New);
i::Handle<i::Object> result = isolate->global_handles()->Create(*obj);
return result.location();
}
GlobalHandles
can be found in deps/v8/src/global-handles.h
. A global handle is alive until it's Destroy
function is called (so it is not cleared when it does out of scope like a local handle with a HandleScope).
(lldb) expr *this
(v8::internal::GlobalHandles) $126 = {
isolate_ = 0x0000000105807800
number_of_global_handles_ = 1
first_block_ = 0x0000000104801800
first_used_block_ = 0x0000000104801800
first_free_ = 0x0000000104801820
new_space_nodes_ = size=1 {
[0] = 0x0000000104801800
}
post_gc_processing_count_ = 0
number_of_phantom_handle_resets_ = 0
pending_phantom_callbacks_ = size=0 {}
}
GlobalHandles class has a number of private members as we can see above:
Isolate* isolate_;
int number_of_global_handles_;
NodeBlock* first_block_;
NodeBlock* first_used_block_;
Node* first_free_;
std::vector<Node*> new_space_nodes_;
int post_gc_processing_count_;
size_t number_of_phantom_handle_resets_;
std::vector<PendingPhantomCallback> pending_phantom_callbacks_;
So lets take a closer look at Create
and see what it does.
Node* result = first_free_;
result->Acquire(value);
Acquire
performs the following on the passed on Object:
DCHECK(state() == FREE);
object_ = object;
class_id_ = v8::HeapProfiler::kPersistentHandleNoClassId;
set_active(false);
set_state(NORMAL);
parameter_or_next_free_.parameter = nullptr;
weak_callback_ = nullptr;
IncreaseBlockUses();
Now, Aquire is a member function on GlobalHandles::Node and we can see that the object pointer is set, and the class_id_ (and others but I'm focusing on these two).
Create then returns:
return result->handle();
Which does:
Handle<Object> handle() { return Handle<Object>(location()); }
v8::Object does not have any members and an Object can be either a Smi or a HeapObject.
AsyncReset()
is called from AsyncWrap's constructor:
// Use AsyncReset() call to execute the init() callbacks.
AsyncReset(execution_async_id);
Note that AsyncWrap has a default value for execution_async_id
in async_wrap.h:
double execution_async_id = -1
Which is the value of execution_async_id in this call:
(lldb) expr execution_async_id
(double) $25 = -1
Also not the AsyncReset
has default values defined for its parameters:
void AsyncReset(double execution_async_id = -1, bool silent = false);
So lets take a look at AsyncReset
:
async_id_ = execution_async_id == -1 ? env()->new_async_id() : execution_async_id;
In our case we will get a new async_id (double) from the environment instance:
inline double Environment::new_async_id() {
async_hooks()->async_id_fields()[AsyncHooks::kAsyncIdCounter] += 1;
return async_hooks()->async_id_fields()[AsyncHooks::kAsyncIdCounter];
}
lldb) expr *async_hooks()
(node::Environment::AsyncHooks) $28 = {
...
async_id_fields_ = {
isolate_ = 0x0000000105007400
count_ = 4
byte_offset_ = 0
buffer_ = 0x0000000104a1a2e0
js_array_ = {
v8::PersistentBase<v8::Float64Array> = (val_ = 0x000000010505c840)
}
free_buffer_ = true
}
async_id_fields_
is defined in src/env.h:
AliasedBuffer<double, v8::Float64Array> async_id_fields_;
The types of fields are:
enum UidFields {
kExecutionAsyncId,
kTriggerAsyncId,
kAsyncIdCounter,
kDefaultTriggerAsyncId,
kUidFieldsCount,
};
So if we look at the above call again:
async_hooks()->async_id_fields()[AsyncHooks::kAsyncIdCounter] += 1;
So we obtaining an id for this AsyncWrap resource which remember is of type TCPWrap. Next we have:
trigger_async_id_ = env()->get_default_trigger_async_id();
The trigger id is an id for what triggered. Which we can find in src/env-inl.h:
inline double Environment::get_default_trigger_async_id() {
double default_trigger_async_id = async_hooks()->async_id_fields()[AsyncHooks::kDefaultTriggerAsyncId];
// If defaultTriggerAsyncId isn't set, use the executionAsyncId
if (default_trigger_async_id < 0)
default_trigger_async_id = execution_async_id();
return default_trigger_async_id;
}
This time we are using kDefaultTriggerAsyncId as the index. In our case:
(lldb) expr default_trigger_async_id
(double) $36 = -1
So execution_async_id()
will be called which does:
return async_hooks()->async_id_fields()[AsyncHooks::kExecutionAsyncId];
Next in AsyncReset
we have:
switch (provider_type()) {
#define V(PROVIDER) \
case PROVIDER_ ## PROVIDER: \
TRACE_EVENT_NESTABLE_ASYNC_BEGIN2("node.async_hooks", \
#PROVIDER, static_cast<int64_t>(get_async_id()), \
"executionAsyncId", \
static_cast<int64_t>(env()->execution_async_id()), \
"triggerAsyncId", \
static_cast<int64_t>(get_trigger_async_id())); \
break;
NODE_ASYNC_PROVIDER_TYPES(V)
#undef V
default:
UNREACHABLE();
}
(lldb) expr provider_type()
(node::AsyncWrap::ProviderType) $39 = PROVIDER_TCPSERVERWRAP
Lets expand the macro for the provider:
case PROVIDER_TCPSERVERWRAP:
TRACE_EVENT_NESTABLE_ASYNC_BEGIN2("node.async_hooks",
TCPSERVERWRAP, static_cast<int64_t>(get_async_id()),
"executionAsyncId",
static_cast<int64_t>(env()->execution_async_id()),
"triggerAsyncId",
static_cast<int64_t>(get_trigger_async_id()));
break;
This add a v8 tracing event. TODO: take a closer look at this.
Next, we have the following:
if (silent) return;
EmitAsyncInit(env(), object(),
env()->async_hooks()->provider_string(provider_type()),
async_id_, trigger_async_id_);
EmitAsyncInit
has the following code:
Local<Function> init_fn = env->async_hooks_init_function();
This is the function that we looked at earlier. We can see the arguments being created:
Local<Value> argv[] = {
Number::New(env->isolate(), async_id),
type,
Number::New(env->isolate(), trigger_async_id),
object,
};
And these match the parameters that init
takes:
init uid: 6 , provider: TCPSERVERWRAP , parentUid: 1 , parentHandle: TCP { reading: false, owner: null, onread: null, onconnection: null }
And the function will execute the code generated from emitInitNative
which will call all the init
functions that have been registered.
So where/when are the init functions added to the active_hooks.array ?
Well, if we take a look at lib/internal/async_hooks.js we find:
const active_hooks = {
// Array of all AsyncHooks that will be iterated whenever an async event fires.
array: [],
call_depth: 0,
tmp_array: null,
tmp_fields: null
};
active_hooks
is accessed by calling setHookArrays()
in lib/async_hooks.js, called in enable
and disable
.
Let's take a look at enable
first:
const [hooks_array, hook_fields] = getHookArrays();
// Each hook is only allowed to be added once.
if (hooks_array.includes(this))
return this;
const prev_kTotals = hook_fields[kTotals];
hook_fields[kTotals] = 0;
// createHook() has already enforced that the callbacks are all functions,
// so here simply increment the count of whether each callbacks exists or
// not.
hook_fields[kTotals] += hook_fields[kInit] += +!!this[init_symbol];
hook_fields[kTotals] += hook_fields[kBefore] += +!!this[before_symbol];
hook_fields[kTotals] += hook_fields[kAfter] += +!!this[after_symbol];
hook_fields[kTotals] += hook_fields[kDestroy] += +!!this[destroy_symbol];
hook_fields[kTotals] += hook_fields[kPromiseResolve] += +!!this[promise_resolve_symbol];
hooks_array.push(this);
if (prev_kTotals === 0 && hook_fields[kTotals] > 0) {
enableHooks();
}
return this;
hook_fields[kTotals] is first set to 0. Not sure why the first line is doing +=
as it would
only be adding zero. Could that not just be:
hook_fields[kTotals] = hook_fields[kInit] += +!!this[init_symbol];
But lets take a look of the rest of this expression.
hook_fields[kInit] += +!!this[init_symbol];
hook_fields[kInit] is a counter of all the init hooks. This is going to add either 0 or 1 depending
on if there is a function for the init_symbol
. Notice the +
unary operator sign before double ! operators.
It will try to convert that boolean to a number. So if an init function was defined this will add 1 to
hooks_fields[kInit] otherwise 0.
Next, we see that this AsyncHook instance is added to the hooks_array. This answers the above question
regarding here the init fuctions are added to the hooks array.
Following this, lets take a look at enableHooks() which can be found in lib/internal/async_wrap.js:
function enableHooks() {
enablePromiseHook();
async_hook_fields[kCheck] += 1;
}
enablePromiseHook
can be found in src/async_wrap.cc:
static void EnablePromiseHook(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
env->AddPromiseHook(PromiseHook, static_cast<void*>(env));
}
This will land us in env.cc
AddPromiseHook
:
void Environment::AddPromiseHook(promise_hook_func fn, void* arg) {
auto it = std::find_if(
promise_hooks_.begin(), promise_hooks_.end(),
[&](const PromiseHookCallback& hook) {
return hook.cb_ == fn && hook.arg_ == arg;
});
if (it != promise_hooks_.end()) {
it->enable_count_++;
return;
}
promise_hooks_.push_back(PromiseHookCallback{fn, arg, 1});
if (promise_hooks_.size() == 1) {
isolate_->SetPromiseHook(EnvPromiseHook);
}
}
Every environment has a vector of PromiseHookCallbacks (env.h):
struct PromiseHookCallback {
promise_hook_func cb_;
void* arg_;
size_t enable_count_;
};
std::vector<PromiseHookCallback> promise_hooks_;
Next, remember that std::find_if
will return an iterator to the first element for which the
predicate returns true. If nothing is found the function returns last/end.
So if the passed in promise_hook_func
which is a typedef for a function pointer:
typedef void (*promise_hook_func) (v8::PromiseHookType type,
v8::Local<v8::Promise> promise,
v8::Local<v8::Value> parent,
void* arg);
if the fn and the arg already match an existing PromiseHookCallback, the iterator
will not be equal to end() and in that case that PromiseHookCallback will have
it's enable_count incremented and return. If the passed in promise_hook_func and args
have not been added then a new PromiseHookCallback will be created and added to
promise_hooks_
.
Last thing that happens in AddPromiseHook
is SetPromiseHook
is called on the
isolate. The function EnvPromiseHook is a promise hook that will run all the
promise_hook_'s added.
So when will EnvPromiseHook
be called?
Lets take a closer look at the V8 side of this. If we look in deps/v8/include/v8.h
we
can find:
enum class PromiseHookType { kInit, kResolve, kBefore, kAfter };
typedef void (*PromiseHook)(PromiseHookType type, Local<Promise> promise,
Local<Value> parent);
So we can see that PromiseHook is a function pointer to a function that takes a type,
promise, and a parent value.
This callback will be invoked by Isolate::RunPromiseHook
:
void Isolate::RunPromiseHook(PromiseHookType type, Handle<JSPromise> promise,
Handle<Object> parent) {
if (debug()->is_active()) debug()->RunPromiseHook(type, promise, parent);
if (promise_hook_ == nullptr) return;
promise_hook_(type, v8::Utils::PromiseToLocal(promise), v8::Utils::ToLocal(parent));
}
The first time this is called is from deps/v8/src/runtime/runtime-promise.cc
and:
RUNTIME_FUNCTION(Runtime_PromiseHookInit) {
HandleScope scope(isolate);
DCHECK_EQ(2, args.length());
CONVERT_ARG_HANDLE_CHECKED(JSPromise, promise, 0);
CONVERT_ARG_HANDLE_CHECKED(Object, parent, 1);
isolate->RunPromiseHook(PromiseHookType::kInit, promise, parent);
return isolate->heap()->undefined_value();
}
V8 provides four hooks, init
, resolve
, before
, and after
.
Init is run when a new Promise is created in V8. resolve
when a promise is resolved.
before
is run before a PromiseReactionJob, and after
after that job.
Lets say we create the following JavaScript:
const p = new Promise((resolve, reject) => {
resolve('ok');
});
If I'm reading this correctly, this would invoke code generated by v8/src/builtins/builtins-promise-gen.cc
:
TF_BUILTIN(PromiseConstructor, PromiseBuiltinsAssembler) {
...
GotoIfNot(IsPromiseHookEnabledOrDebugIsActive(), &debug_push);
CallRuntime(Runtime::kPromiseHookInit, context, instance, UndefinedConstant());
}
So, Runtime_PromiseHookInit
would only be called if a promise hook was enabled. Lets see if we can force this:
(lldb) br s -n ExecuteScript
(lldb) r
(lldb) expr ((v8::internal::Isolate*)env->isolate())->promise_hook_or_debug_is_active_
(bool) $8 = false
(lldb) expr ((v8::internal::Isolate*)env->isolate())->promise_hook_or_debug_is_active_ = true
(bool) $9 = true
(lldb) br s -f runtime-promise.cc -l 113
(lldb) continue
Indeed, we can now see that Runtime_PromiseHookInit
is getting called and it will in turn call
Isolate::RunPromiseHook
, but in this case promise_hook_ will be a nullptr as it was not set
it will simply return.
When a PromiseHook as been set on the isolate it will call EnvPromiseHook
EnvPromiseHook
will iterate of all the added promise_hook_s in the environment
and call there:
Environment* env = Environment::GetCurrent(promise->CreationContext());
for (const PromiseHookCallback& hook : env->promise_hooks_) {
hook.cb_(type, promise, parent, hook.arg_);
}
Recall, that the cb_
in this case it the callback added from async_wrap.cc
by
EnablePromiseHook
which was called from enableHooks()
in lib/internal/async_wrap.js
:
env->AddPromiseHook(PromiseHook, static_cast<void*>(env));
So, lets take a look a PromiseHook
. For this we need to update our example to use both async_hooks and
a promise:
const http = require('http')
const ah = require('async_hooks');
let asyncObject = {
init: function(uid, provider, parentUid, parentHandle) {
process._rawDebug('init uid:', uid, ', provider:', provider, ', parentUid:', parentUid);
},
before: function(uid) {
process._rawDebug('before uid:', uid);
},
after: function(uid, didThrow) {
process._rawDebug('after. uid:', uid, 'didThrow:', didThrow);
},
destroy: function(uid) {
process._rawDebug('destroy: uid:', uid);
},
promiseResolve: function(uid) {
process._rawDebug('promiseResolve: uid:', uid);
}
};
let asynchook = ah.createHook(asyncObject);
asynchook.enable();
const p = new Promise((resolve, reject) => {
resolve('ok');
});
p.then(msg => {
console.log(msg);
});
We should now be able to set a breakpoint in PromiseHook
:
(lldb) br s -f async_wrap.cc -l 288
(lldb) r
Just to recap, Runtime_PromiseHookInit
in deps/v8/src/runtime/runtime-promise.cc
will
call our PromiseHook
:
isolate->RunPromiseHook(PromiseHookType::kInit, promise, parent);
And Isolate::RunPromiseHook
will call the registered hook, which is EnvPromiseHook
, which
iterates over the registered PromiseHookCallback
(which is a struct containing the callback,
the arg and a counter) calling the struct's callback (which is PromiseHook):
static void PromiseHook(PromiseHookType type, Local<Promise> promise,
Local<Value> parent, void* arg) {
Environment* env = static_cast<Environment*>(arg);
Local<Value> resource_object_value = promise->GetInternalField(0);
(lldb) expr promise
(v8::Local<v8::Promise>) $67 = (val_ = 0x00007fff5fbfce10)
(lldb) expr promise->State()
(v8::Promise::PromiseState) $12 = kPending
(lldb) expr promise->HasHandler()
(bool) $11 = false
v8::Promise
can be found in deps/v8/include/v8.h. A v8::Promise also have a Result() function
which can be called if the state is not pending. And a Catch and Then function to register function
handlers. So this is the promise that was passed up from V8.
GetInternalField
is a function of v8::Object. For kInit there will not be any thing in the internal field (at least
not during this debugging session) but later when PromiseWrap::New
is called:
wrap = PromiseWrap::New(env, promise, nullptr, silent);
So, lets take a closer look at PromiseWrap::New
:
Local<Object> object = env->promise_wrap_template()->NewInstance(env->context()).ToLocalChecked();
object->SetInternalField(PromiseWrap::kPromiseField, promise);
object->SetInternalField(PromiseWrap::kIsChainedPromiseField,
parent_wrap != nullptr ?
v8::True(env->isolate()) :
v8::False(env->isolate()));
CHECK_EQ(promise->GetAlignedPointerFromInternalField(0), nullptr);
promise->SetInternalField(0, object);
return new PromiseWrap(env, object, silent);
As the name of this class indicates this will wrap a v8::Promise. So we first create
a new instance of the template that was created previously in SetupHooks
. The wrapping
is done by setting the promise on this object as an internal field (kPromiseField).
Also note that the object is set as an internal field on the promise, as index 0. This is
later accessed in PromiseHook
Local<Value> resource_object_value = promise->GetInternalField(0);
PromiseWrap* wrap = nullptr;
if (resource_object_value->IsObject()) {
Local<Object> resource_object = resource_object_value.As<Object>();
wrap = Unwrap<PromiseWrap>(resource_object);
}
So resource_object_value
can contain PromiseWrap
and if so it is unwrapped.
When a parent promise exists a DefaultTriggerAsyncIdScope is used before calling PromiseWrap::New
:
AsyncHooks::DefaultTriggerAsyncIdScope trigger_scope(parent_wrap);
The constructor that takes a AsyncWrap
can be found in src/async_wrap-inl.h
:
inline Environment::AsyncHooks::DefaultTriggerAsyncIdScope ::DefaultTriggerAsyncIdScope(AsyncWrap* async_wrap)
: DefaultTriggerAsyncIdScope(async_wrap->env(),
async_wrap->get_async_id()) {}
So this constructor just delegates to the one that takes an Environment pointer and a double. This constructor
can be found in src/env-inl.h
.
Recall that this following SetAccessor
was set on the template:
promise_wrap_template->SetAccessor(FIXED_ONE_BYTE_STRING(env->isolate(), "promise"), PromiseWrap::GetPromise);
So accessing promise
on object
above will invoke PromiseWrap::GetPromise
:
info.GetReturnValue().Set(info.Holder()->GetInternalField(kPromiseField));
Next, a new PromiseWrap is created using this object. And since PromiseWrap
extends AsyncWrap it's constructor will
call AsyncReset
:
AsyncReset(-1, silent);
which will call AsyncWrap::EmitAsyncInit
:
Local<Value> argv[] = {
Number::New(env->isolate(), async_id),
type,
Number::New(env->isolate(), trigger_async_id),
object,
};
USE(init_fn->Call(env->context(), object, arraysize(argv), argv));
These are the arguments that will be passed to init
:
init uid: 6 , provider: PROMISE , parentUid: 1 , parentHandle: PromiseWrap { isChainedPromise: false, promise: Promise { <pending> } }
In lib/internal/async_hooks.js we have these two fields:
const async_wrap = process.binding('async_wrap');
const { async_hook_fields, async_id_fields } = async_wrap;
When process.binding('async_wrap')
is called this will invoke AsyncWrap::Initialize
:
#define FORCE_SET_TARGET_FIELD(obj, str, field) \
(obj)->DefineOwnProperty(context, \
FIXED_ONE_BYTE_STRING(isolate, str), \
field, \
ReadOnlyDontDelete).FromJust()
// Attach the uint32_t[] where each slot contains the count of the number of
// callbacks waiting to be called on a particular event. It can then be
// incremented/decremented from JS quickly to communicate to C++ if there are
// any callbacks waiting to be called.
FORCE_SET_TARGET_FIELD(target, "async_hook_fields", env->async_hooks()->fields().GetJSArray());
This will expand to:
target->DefineOwnProperty(context,
FIXED_ONE_BYTE_STRING(isolate, "async_hook_fields"),
env->async_hooks()->fields().GetJSArray(),
ReadOnlyDontDelete).FromJust()
Notice that the field here is env->async_hooks()->fields().GetJSArray()
. If we look at the fields function of
AsyncHooks
we see:
inline AliasedBuffer<uint32_t, v8::Uint32Array>& fields();
So the native type And inspecting it we can see:
(lldb) expr *this
(node::AliasedBuffer<unsigned int, v8::Uint32Array>) $6 = {
isolate_ = 0x0000000105007400
count_ = 8
byte_offset_ = 0
buffer_ = 0x0000000104a1c590
js_array_ = {
v8::PersistentBase<v8::Uint32Array> = (val_ = 0x000000010505ba20)
}
free_buffer_ = true
}
When an Environment is created the AsyncHooks constructor will be called as it is a member of Environment (src/env.h):
AsyncHooks async_hooks_;
AsyncHooks constructor (src/env-inl.h) will set construct a AliasedBuffer for the the async_id_fields_ member:
fields_(env()->isolate(), kFieldsCount),
In the constructor for AliasedBuffer we can see that a buffer is created:
buffer_ = Calloc<NativeT>(count)
So we are going to allocate zeroed out memory for an unsigned int (buffer_ pointing to it)
(lldb) expr count
(size_t) $11 = 8
(lldb) expr buffer_
(unsigned int *) $13 = 0x0000000106000000
Next, we are going to create a V8 ArrayBuffer using the newly allocated block of memory:
v8::Local<v8::ArrayBuffer> ab = v8::ArrayBuffer::New(isolate_, buffer_, sizeInBytes);
(lldb) jlh ab
0x24f52a506c11: [JSArrayBuffer]
- map = 0x24f5fbb02c01 [FastProperties]
- prototype = 0x24f522e89819
- elements = 0x24f5eb102251 <FixedArray[0]> [HOLEY_ELEMENTS]
- embedder fields: 2
- backing_store = 0x106000000
- byte_length = 32
- external
- neuterable
- properties = 0x24f5eb102251 <FixedArray[0]> {}
- embedder fields = {
0x0
0x0
}
We can see that the backing_store
is pointing to the memory that was allocated.
An ArrayBuffer is an object that represents a block of data. A view is used to
access the data, which are called TypedArray views. This is what we create next:
v8::Local<V8T> js_array = V8T::New(ab, byte_offset_, count);
(lldb) expr byte_offset_
(size_t) $31 = 0
(lldb) expr count
(size_t) $32 = 8
(lldb) jlh js_array
0x24f52a506c61: [JSTypedArray]
- map = 0x24f5fbb03011 [FastProperties]
- prototype = 0x24f522e8ac51
- elements = 0x24f52a506ca9 <FixedUint32Array[8]> [UINT32_ELEMENTS]
- embedder fields: 2
- buffer = 0x24f52a506c11 <ArrayBuffer map = 0x24f5fbb02c01>
- byte_offset = 0
- byte_length = 32
- length = 8
- properties = 0x24f5eb102251 <FixedArray[0]> {}
- elements = 0x24f52a506ca9 <FixedUint32Array[8]> {
0-7: 0
}
- embedder fields = {
0x0
0x0
}
Next in AsyncHooks contructor we have the following line:
fields_[kCheck] = 1;
AliasedBuffer
has overloaded the [] operator so this call will invoke AliasedBuffer::operator[]:
Reference operator[](size_t index) {
return Reference(this, index);
}
And Reference
in turn overloads the = operator so it will be invoked:
template <typename T>
inline Reference& operator=(const T& val) {
aliased_buffer_->SetValue(index_, val);
return *this;
}
Reference
is used to store the index for the element being used (plus the
AliasedBuffer pointer).
So we can see that we are setting index_ which is kCheck, to value which is 1:
(lldb) expr index_
(size_t) $57 = 6
(lldb) expr ::Fields::kCheck
(int) $60 = 6
(lldb) expr val
(const int) $61 = 1
So we can get and set values using:
(lldb) expr fields_.SetValue(::Fields::kCheck, 2)
(lldb) expr fields_.GetValue(::Fields::kCheck)
(unsigned int) $69 = 2
(lldb) expr fields_
(node::AliasedBuffer<unsigned int, v8::Uint32Array>) $72 = {
isolate_ = 0x0000000105007400
count_ = 8
byte_offset_ = 0
buffer_ = 0x0000000106000000
js_array_ = {
v8::PersistentBase<v8::Uint32Array> = (val_ = 0x0000000105057a20)
}
free_buffer_ = true
}
(lldb) memory read -f x -s 4 -c 7 0x0000000106000000
0x106000000: 0x00000000 0x00000000 0x00000000 0x00000000
0x106000010: 0x00000000 0x00000000 0x00000002
(lldb) expr fields_.SetValue(::Fields::kCheck, 1)
(lldb) memory read -f x -s 4 -c 7 0x0000000106000000
0x106000000: 0x00000000 0x00000000 0x00000000 0x00000000
0x106000010: 0x00000000 0x00000000 0x00000001
So we bacially have an 32 bit array which have 8 values in indexed using AsyncHooks::Fields enum. These are basically counters as far as I can tell. The entries are used to keep track of the number of init functions that should be called. Looking at the code we can see that
Getting back on track we were in AsyncWrap::Initialize
:
target->DefineOwnProperty(context,
FIXED_ONE_BYTE_STRING(isolate, "async_hook_fields"),
env->async_hooks()->fields().GetJSArray(),
ReadOnlyDontDelete).FromJust()
So looking again at this like we can see that we are retreiving the JSTypedArray and setting that
on the target as async_hook_fields
(lldb) expr env->async_hooks()->fields().GetJSArray()->Buffer()->GetContents()
(v8::ArrayBuffer::Contents) $302 = {
data_ = 0x0000000106000000
byte_length_ = 32
allocation_base_ = 0x0000000106000000
allocation_length_ = 32
allocation_mode_ = kNormal
}
Notice that the data_
field points to the same memory location as buffer_
above.
Next constants
property is populated.
AliasedBuffer
AliasedBuffer
has overloaded the [] operator so this call will invoke AliasedBuffer::operator[]:
Reference operator[](size_t index) {
return Reference(this, index);
}
And Reference
in turn overloads the = operator so it will be invoked:
template <typename T>
inline Reference& operator=(const T& val) {
aliased_buffer_->SetValue(index_, val);
return *this;
}
Reference
is used to store the index for the element being used (plus the
AliasedBuffer pointer).
v8::Isolate* isolate_;
size_t count_;
size_t byte_offset_;
NativeT* buffer_;
v8::Global<V8T> js_array_;
bool free_buffer_;
HandleWrap
HandleWrap represents a libuv handle which represents . Take the following functions:
static void Close(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Ref(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Unref(const v8::FunctionCallbackInfo<v8::Value>& args);
static void HasRef(const v8::FunctionCallbackInfo<v8::Value>& args);
There are libuv counter parts for these in uv_handle_t:
void uv_close(uv_handle_t* handle, uv_close_cb close_cb)
void uv_ref(uv_handle_t* handle)
void uv_unref(uv_handle_t* handle)
int uv_has_ref(const uv_handle_t* handle)
Just like in libuv where uv_handle_t is a base type for all libuv handles, HandleWrap is a base class for all Node.js Wrap classes.
Every uv_handle_t can have a data member, and this is being set in the constructor to this instance of HandleWrap.
handle__->data = this;
HandleScope scope(env->isolate());
Wrap(object, this);
In HandleWrap's constructor the HandleWrap is added to the queue of HandleWraps in the Environment:
env->handle_wrap_queue()->PushBack(this);
libuv has the following types of handle types:
#define UV_HANDLE_TYPE_MAP(XX) \
XX(ASYNC, async) \
XX(CHECK, check) \
XX(FS_EVENT, fs_event) \
XX(FS_POLL, fs_poll) \
XX(HANDLE, handle) \
XX(IDLE, idle) \
XX(NAMED_PIPE, pipe) \
XX(POLL, poll) \
XX(PREPARE, prepare) \
XX(PROCESS, process) \
XX(STREAM, stream) \
XX(TCP, tcp) \
XX(TIMER, timer) \
XX(TTY, tty) \
XX(UDP, udp) \
XX(SIGNAL, signal) \
struct uv_tcp_s {
UV_HANDLE_FIELDS
UV_STREAM_FIELDS
UV_TCP_PRIVATE_FIELDS
};
We know that TCPWrap is a built-in module and that it's Initialize method is called, which sets up all the prototype functions available,
among them listen
:
env->SetProtoMethod(t, "listen", Listen);
And in Listen
we find:
int backlog = args[0]->Int32Value();
int err = uv_listen(reinterpret_cast<uv_stream_t*>(&wrap->handle_),
backlog,
OnConnection);
args.GetReturnValue().Set(err);
We can find a similarity in Node where TCPWrap indirectly also extends StreamWrap (which extends HandleWrap).
Wrap
template <typename TypeName>
void Wrap(v8::Local<v8::Object> object, TypeName* pointer) {
CHECK_EQ(false, object.IsEmpty());
CHECK_GT(object->InternalFieldCount(), 0);
object->SetAlignedPointerInInternalField(0, pointer);
}
Here we can see that we are setting a pointer in field 0. The object
in question, and pointer
the pointer to this HandleWrap.
persistent().Reset will destroy the underlying storage cell if it is non-empty, and create a new one the handle.
MakeWeak:
inline void MakeWeak(void) {
persistent().SetWeak(this, WeakCallback, v8::WeakCallbackType::kParameter);
persistent().MarkIndependent();
}
The above is installing a finalization callback on the persistent object. Marking the persistent object as independant means that the GC is free to ignore object groups containing this persistent object. Why is this done? I don't know enough about the V8 GC yet to answer this.
The callback may be called (best effort) and it looks like this:
static void WeakCallback(const v8::WeakCallbackInfo<ObjectWrap>& data) {
ObjectWrap* wrap = data.GetParameter();
assert(wrap->refs_ == 0);
wrap->handle_.Reset();
delete wrap;
}
ContextifyScript will call MakeWeak
in it's constructor:
ContextifyScript(Environment* env, Local<Object> object) : BaseObject(env, object) {
MakeWeak();
}
So we are calling MakeWeak
so that a callback (a finalizer) will be called when the GC has determined that there are not more
refs to the object.
If we take a look at MakeWeak
:
void BaseObject::MakeWeak() {
persistent_handle_.SetWeak(
this,
[](const v8::WeakCallbackInfo<BaseObject>& data) {
std::unique_ptr<BaseObject> obj(data.GetParameter());
// Clear the persistent handle so that ~BaseObject() doesn't attempt
// to mess with internal fields, since the JS object may have
// transitioned into an invalid state.
// Refs: https://github.com/nodejs/node/issues/18897
obj->persistent_handle_.Reset();
}, v8::WeakCallbackType::kParameter);
}
From looking at the code the callback is called after the
And the destructor looks like this now:
BaseObject::~BaseObject() {
env_->RemoveCleanupHook(DeleteMe, static_cast<void*>(this));
if (persistent_handle_.IsEmpty()) {
// This most likely happened because the weak callback below cleared it.
return;
}
{
v8::HandleScope handle_scope(env_->isolate());
object()->SetAlignedPointerInInternalField(0, nullptr);
}
}
Notice the call to persitent_handle_.IsEmpty(), so if it is emtpy we will not do anything. So could this callback could be avoided completely? Making something a weak pointer will allow it to be GC'd but you might not require any callback to be invoked. In that case you can just call:
persistent_handle_.SetWeak();
Remember, obj
is of type BaseObject and is not managed by V8, but the persistent_handle_
is managed by V8's GC. When we get the callback that the persistent_handle_ is
about to be freed. If we don't call Reset
, then ~BaseObject
will try to set
the internal field on the now freed persistent_handle_
. object()
calls and
SetAlignedPointerInInternalField
will segfault as the handle has already been freed.
v8::Local<v8::Object> BaseObject::object() const {
return PersistentToLocal(env_->isolate(), persistent_handle_);
}
My understanding of this is that when the callback/finalizer lambda is called the underlying Persistent object will have been freed, but the BaseObject
instance still has a reference to it. It uses this reference in ~ObjectBaset()
create a new Local, and then calls SetAlignedPointerInInternalField
which will segfault when trying to OpenHandle (which has been feed).
So, what does Reset do?
V8::DisposeGlobal(reinterpret_castinternal::Object**(this->val_)); i::GlobalHandles::Destroy(location); global-handles.cc:94
(lldb) expr persistent_handle_ (node::Persistentv8::Object) $2 = { v8::PersistentBasev8::Object = (val_ = 0x0000000108005fa0) }
TcpWrap
TcpWrap extends ConnectionWrap Lets take a look at the creation of a TcpWrap:
wrap = new TCPWrap(env, args.This(), nullptr);
What is args.This(). That will be the (v8::Localv8::Object) object that will be wrapped.
This be passed to ConnectionWrap's constructor, which in turn will pass it to StreamWrap's constructor, which will pass it to HandleWrap's constructor, which will pass it to AsyncWrap's constructor, which will pass it to BaseObject's constructor which will set this/create a persistent object to store the handle:
: persistent_handle_(env->isolate(), handle)
I've not seen this before, initializing a member with two parameters, and I cannot find a function that matches this signature. What is going on there?
Well, the type of persistent_handle_
is :
v8::Persistent<v8::Object> persistent_handle_;
And the constructor for Persistent looks like this:
template <class S>
V8_INLINE Persistent(Isolate* isolate, Local<S> that)
: PersistentBase<T>(PersistentBase<T>::New(isolate, *that)) {
TYPE_CHECK(T, S);
}
IsolateData
Has a public constructor that takes a pointer to Isolate, a pointer to uv_loop_t, and a pointer to uint32 zero_fill_field. An IsolateData instance also has a number of public methods:
#define VP(PropertyName, StringValue) V(v8::Private, PropertyName, StringValue)
#define VS(PropertyName, StringValue) V(v8::String, PropertyName, StringValue)
#define V(TypeName, PropertyName, StringValue) \
inline v8::Local<TypeName> PropertyName(v8::Isolate* isolate) const;
PER_ISOLATE_PRIVATE_SYMBOL_PROPERTIES(VP)
PER_ISOLATE_STRING_PROPERTIES(VS)
#undef V
#undef VS
#undef VP
What is happening here is that we are declaring methods for each for the PER_ISOLATE_PRIVATE_SYMBOL_PROPERTIES. Since VP is being passed and the type for those methods is v8::Private there will be the following methods:
v8::Local<Private> alpn_buffer_private_symbol(v8::Isolate* isolate) const;
v8::Local<Private> arrow_message_private_symbol(v8::Isolate* isolate) const;
...
But what is the StringValue used for?
The StringValue is actually not used here, see #7905 for details.
The StringValue is used in the definition though which can be found in src/env-inl.h:
inline IsolateData::IsolateData(v8::Isolate* isolate, uv_loop_t* event_loop,
uint32_t* zero_fill_field)
:
#define V(PropertyName, StringValue) \
PropertyName ## _( \
isolate, \
v8::Private::New( \
isolate, \
v8::String::NewFromOneByte( \
isolate, \
reinterpret_cast<const uint8_t*>(StringValue), \
v8::NewStringType::kInternalized, \
sizeof(StringValue) - 1).ToLocalChecked())),
PER_ISOLATE_PRIVATE_SYMBOL_PROPERTIES(V)
#undef V
#define V(PropertyName, StringValue) \
PropertyName ## _( \
isolate, \
v8::String::NewFromOneByte( \
isolate, \
reinterpret_cast<const uint8_t*>(StringValue), \
v8::NewStringType::kInternalized, \
sizeof(StringValue) - 1).ToLocalChecked()),
PER_ISOLATE_STRING_PROPERTIES(V)
#undef V
This is the definition of the IsolateData constructor, and it is setting each of the private member fields to the StringValue. I created an example to try this out. While it might not be easy on the eyes this does have a major advantage of not having to maintain all of these accessor methods. Adding a new one is simply a matter of adding an entry to the macro.
So now that we understand the macro, lets take a look at the actual information that this class stores/provides.
All the property accessors defined above are available using he IsolateData instance but also they can be called using an Environment instance which just passes the calls through to the IsolateData instance.
The per isolate private members are the following:
V(alpn_buffer_private_symbol, "node:alpnBuffer")
V(npn_buffer_private_symbol, "node:npnBuffer")
V(selected_npn_buffer_private_symbol, "node:selectedNpnBuffer")
The above are used by node_crypto.cc which makes sense as Application Level Protocol Negotiation (ALPN) is an TLS protocol, as it Next Prototol Negotiation (NPN).
V(arrow_message_private_symbol, "node:arrowMessage")
Not sure exactly what this does but from a quick search it looks like it has to do with exception handling and printing of error messages. TODO: revisit this later.
An IsolateData (and also an Environement as it proxies these members) actually has a lot of members, too many to list here it is easy to do a search for them.
Running lint
$ make lint
$ make jslint
Run lint os one file:
$ ./tools/node_modules/eslint/bin/eslint.js --rulesdir=tools/eslint-rules --ext=.js,.mjs,.md test/sequential/test-benchmark-tls.js
Running tests
To run the test use the following command:
$ make -j4 test
The -j is the number of processes to use.
Mac firewall exceptions
On mac you might find it popping up dialogs about the firwall blocking access to the node
and cctest
applications when running
the tests. There is a script in node/tools
that can run to add rules to the firewall:
$ sudo tools/macosx-firewall.sh
Running a script
This section attempts to explain the process of running a javascript file. We will create a break point in the javascript source and see how it is executed.
$ ./node -inspect-brk
Next, start lldb
and
$ lldb -- out/Debug/node --inspect-brk test/parallel/test-tcp-wrap-connect.js
Now, when a script is executed it will be read and loaded. Where is this done?
To recap the loading is done by LoadEnvironment
which loads and executes lib/internal/bootstrap/node.js
. This is a function which is
then executed:
Local<Value> arg = env->process_object();
f->Call(Null(env->isolate()), 1, &arg);
As we can see the process_object which was configured earlier is passed into the function:
(function(process) {
function startup() {
...
}
//other functions
startup();
});
We can see that the startup
function will be called when the the f
is called. Since we are specifying a script to run we will
be looking at setting up the various object in the environment, mosty using the passed in process object (TODO: need to write out the
details for this later) and eventually running:
preloadModules();
run(Module.runMain);
Module.runMain
is a function in lib/module.js
:
// bootstrap main module.
Module.runMain = function() {
// Load the main module--the command line argument.
Module._load(process.argv[1], null, true);
// Handle any nextTicks added in the first tick of the program
process._tickCallback();
};
_load
Will check the module cache for the filename and if it already exists just returns the exports object for this module. But otherwise
the filename will be loaded using the file extension. Possible extensions are .js
, .json
, and .node
(defaulting to .js if no extension is given).
Module._extensions[extension](this, filename);
We know our extension is .js
so lets look closer at it:
// Native extension for .js
Module._extensions['.js'] = function(module, filename) {
var content = fs.readFileSync(filename, 'utf8');
module._compile(internalModule.stripBOM(content), filename);
};
So lets take a look at _compile_
module._compile
After removing the shebang from the content
which is passed in as the first parameter the content is wrapped:
var wrapper = Module.wrap(content);
var compiledWrapper = vm.runInThisContext(wrapper, {
filename: filename,
lineOffset: 0,
displayErrors: true
});
vm.runInThisContext
:
var dirname = path.dirname(filename);
var require = internalModule.makeRequireFunction.call(this);
var args = [this.exports, require, this, filename, dirname];
var depth = internalModule.requireDepth;
if (depth === 0) stat.cache = new Map();
var result = compiledWrapper.apply(this.exports, args);
Module.wrap
This is declared as:
const NativeModule = require('native_module');
....
Module.wrap = NativeModule.wrap;
NativeModule can be found lib/internal/bootstrap/node.js:
NativeModule.wrap = function(script) {
return NativeModule.wrapper[0] + script + NativeModule.wrapper[1];
};
NativeModule.wrapper = [
'(function (exports, require, module, __filename, __dirname) { ',
'\n});'
];
We can see here that the content of our JavaScript file will be included/wrapped in
(function (exports, require, module, __filename, __dirname) {
// script content
});'
So this is also how exports
, require
, module
, __filename
, and __dirname
are made available
to all scripts.
So, to recap we have a wrapper
instance that is a function. The next thing that happens in lib/modules.js
is:
var compiledWrapper = vm.runInThisContext(wrapper, {
filename: filename,
lineOffset: 0,
displayErrors: true
});
So what does vm.runInThisContext
do?
This is defined in lib/vm.js
:
exports.runInThisContext = function(code, options) {
var script = new Script(code, options);
return script.runInThisContext(options);
};
As described in the vm the vm module provides APIs for compiling and running code within V8 Virtual Machine contexts. Creating a new Script will compile but not run the code. It can later be run multiple times.
So what is a Script? It is declared as:
const binding = process.binding('contextify');
const Script = binding.ContextifyScript;
What is Contextify about?
This is related to V8 contexts and all JavaScript code is run in a context.
src/node_contextify.cc
is a builtin module and contains an Init
function that does the following (among other things):
env->SetProtoMethod(script_tmpl, "runInContext", RunInContext);
env->SetProtoMethod(script_tmpl, "runInThisContext", RunInThisContext);
script.runInThisContext
in vm.js overrides runInThisContext
and then delegates to src/node_contextify.cc RunInThisContext
.
// Do the eval within this context
Environment* env = Environment::GetCurrent(args);
EvalMachine(env, timeout, display_errors, break_on_sigint, args, &try_catch);
After all this processing is done we will be back in node.cc and continue processing there. As everything is event driven the event loop start running and trigger callbacks for anything that has been set up by the script. Just think about a V8 example you create yourself, you set up the c++ code that is to be called from JavaScript and then V8 takes care of the rest. In node, the script is first wrapped in node specific JavaScript and then executed. Node code uses libuv there are callbacks setup that are called by libuv and more actions taken, like invoking a JavaScript callback function.
Remove need to specify a no-operation immediate_idle_handle
When calling setImmediate, this will schedule the callback passed in to be scheduled for execution after I/O events:
setImmediate(function() {
console.log("In immediate...");
});
Currently this is done by using a libuv uv_check_handle. Since checks are performed after polling for I/O, if there are no idle handle or prepare handle (need to check this) then the I/O polling would block as there would be nothing for the event loop to process until there is an I/O event. But if we have an idle handler there is something for the event loop to do which will cause the poll timeout to be zero and the event loop will not block on I/O.
In src/node.cc there is currently an empty uv_idle_handle callback (IdleImmediateDummy) for this which could be removed if it was possible to pass in a NULL callback. Currently there is a check in libuv checcing if the callback is null and this might not be able to change.
My first idea was to overload the function but C does not suppport overloading, so perhaps having a new function named something like:
uv_idle_start_nop(&handle)
Another option might be to make uv_idle_start an varargs function and if the only one argument is passed (not null but actually missing) then assume that a nop-callback. But looking into a variadic function there is no way to know when there if an argument was provided or not (of the optional arguments that is). Currently I'm only adding a function for uv_idle_start_nop to uv-common.c to try this out and see if I can get some feedback on a better place for this.
This task did not come to anything yet. Perhaps with libuv 2.0 libuv might accepts a null callback.
tcp_wrap and pipe_wrap
Lets take a look at the following statement:
var TCPConnectWrap = process.binding('tcp_wrap').TCPConnectWrap;
var req = new TCPConnectWrap();
We know from before that binding
is set as function on the process object. This was done in SetupProcessObject in node.cc:
env->SetMethod(process, "binding", Binding);
So we are invoking the Binding function in node.cc with the argument 'tcp_wrap':
static void Binding(const FunctionCallbackInfo<Value>& args) {
Binding will extract the first (and only) argument which is the name of the module. Every environment seems to have a cache, and if the module is in this cache it is returned:
Local<Object> cache = env->binding_cache_object();
It will also create a instance of Local<Object> exports which is the object that will be returned.
Local<Object> exports;
So, when the tcp_wrap.cc was Initialized (see section about Builtins):
// Create FunctionTemplate for TCPConnectWrap.
auto constructor = [](const FunctionCallbackInfo<Value>& args) {
CHECK(args.IsConstructCall());
};
auto cwt = FunctionTemplate::New(env->isolate(), constructor);
cwt->InstanceTemplate()->SetInternalFieldCount(1);
cwt->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"));
target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"), cwt->GetFunction());
What is going on here. We create a new FunctionTemplate using the constructor
lamba, this is then added to the target (the object that we are initializing).
The constructor is only checking that the passed in args can be used as a constructor (using new in JavaScript)
The object returned from the constructor call does not have any methods as far as I can tell.
The constructor would late be used like this:
var client = new TCP();
var req = new TCPConnectWrap();
var err = client.connect(req, '127.0.0.1', this.address().port);
Now, we saw that TCP has a bunch of methods set up in Initialize, one of the being connect:
void TCPWrap::Connect(const FunctionCallbackInfo<Value>& args) {
...
Local<Object> req_wrap_obj = args[0].As<Object>();
This is the instance of TCPConnectWrap req
created above and we can see that it is of type v8::Local<v8::Local>
.
ConnectWrap* req_wrap = new ConnectWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_TCPCONNECTWRAP);
Remember that ConnectWrap
extends ReqWrap
which extends AsyncWrap
.
We know that ConnectWrap takes Local<Object>
as the req_wrap_obj
err = uv_tcp_connect(req_wrap->req(), &wrap->handle_, reinterpret_cast<const sockaddr*>(&addr), AfterConnect);
uv_tcp_connect
takes a pointer uv_connect_t
and a pointer to uv_tcp_t
handle. This will connect to the specified sockaddr_in
and the
callback will be called when the connection has been established or if an error occurs. So it makes sense that ConnectWrap extends ReqWrap
as uv_connect_t is a request type in libuv:
/* Request types. */
typedef struct uv_req_s uv_req_t;
typedef struct uv_getaddrinfo_s uv_getaddrinfo_t;
typedef struct uv_getnameinfo_s uv_getnameinfo_t;
typedef struct uv_shutdown_s uv_shutdown_t;
typedef struct uv_write_s uv_write_t;
typedef struct uv_connect_s uv_connect_t; <--------------------
typedef struct uv_udp_send_s uv_udp_send_t;
typedef struct uv_fs_s uv_fs_t;
typedef struct uv_work_s uv_work_t;
AsyncWrap extends BaseObject which
(lldb) p *this
(node::BaseObject) $28 = {
persistent_handle_ = {
v8::PersistentBase<v8::Object> = (val_ = 0x0000000105010c60)
}
env_ = 0x00007fff5fbfe108
}
So each BaseObject instance has a v8::Persistentv8:Object. This is a persistent object as it needs to be preserved accross C++ function boundries. Also, we can see that each BaseObject instance also has a node::Environment associated with it. The only thing that BaseObject's constructor does (baseobject-inl-h) is :
// The zero field holds a pointer to the handle. Immediately set it to
// nullptr in case it's accessed by the user before construction is complete.
if (handle->InternalFieldCount() > 0)
handle->SetAlignedPointerInInternalField(0, nullptr);
So after we have returned to AsyncWraps constructor, and then ReqWrap's we are back in ConnectWrap's constructor:
Wrap(req_wrap_obj, this);
Wrap
in util-inl.h:
template <typename TypeName>
void Wrap(v8::Local<v8::Object> object, TypeName* pointer) {
CHECK_EQ(false, object.IsEmpty());
CHECK_GT(object->InternalFieldCount(), 0);
object->SetAlignedPointerInInternalField(0, pointer);
}
We are now setting index 0 to the pointer which is the current
Object is the v8::Local<v8::Object>
, the one we created in our JavaScript file and passed to the connect method named req
:
var req = new TCPConnectWrap();
var err = client.connect(req, '127.0.0.1', this.address().port);
So we are setting/storing a pointer to the ConnectWrap instance at index 0 of the req_wrap_obj
.
After all that we are ready to make the uv_tcp_connect call:
err = uv_tcp_connect(req_wrap->req(), &wrap->handle_, reinterpret_cast<const sockaddr*>(&addr), AfterConnect);
We can see the callback is node::ConnectionWrap<node::TCPWrap, uv_tcp_s>::AfterConnect(uv_connect_s*, int)
Notice that target is of type Local<Object>.
Local<Object> exports;
....
exports = Object::New(env->isolate());
...
mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);
exports
is what is returned to the caller.
args.GetReturnValue().Set(exports);
And we access the TCPConnectWrap member, which is a function which can be used as a constructor by using new. Lets start with where is ConnectWrap called? It is called from tcp_wrap.cc and its Connect method. ConnectWrap extends ReqWrap which extends AsyncWrap which extens BaseObject
req.oncomplete = function(status, client_, req_) {
So, we know from earlier that our req
object is basically empty. Here we are setting a property name oncomplete
to be
a function. This will be called in connection_wrap.cc 111:
req_wrap->MakeCallback(env->oncomplete_string(), arraysize(argv), argv);
oncomplete_string() is a generated method from a macro in env.h
v8::Local<v8::Value> cb_v = object()->Get(symbol);
CHECK(cb_v->IsFunction());
return MakeCallback(cb_v.As<v8::Function>(), argc, argv);
object()
will return the persistent object to out handle (from base-object-inl.h) :
return PersistentToLocal(env_->isolate(), persistent_handle_);
We can see that the persistent_handle_
is the handle that was created using which makes sense as this
is the object that oncomplete was created for:
var req = new TCPConnectWrap();
We are then calling Get(symbol) which will be a Symbol representing 'oncomplete'. And the calling it with number of arguments, and the arguments themselves.
tcp_wrap.cc
In OnConnect
I found the following:
TCPWrap* tcp_wrap = static_cast<TCPWrap*>(handle->data);
....
Local<Object> client_obj = Instantiate(env, static_cast<AsyncWrap*>(tcp_wrap));
class TCPWrap : public StreamWrap
class StreamWrap : public HandleWrap, public StreamBase
class HandleWrap : public AsyncWrap {
As far as I can tell TCPWrap
is of type AsyncWrap
. Looking at src/pipe_wrap.cc
which has a very similar OnConnect method
(which I'm going to take a stab at refactoring) but does not have this cast.
Refactoring tcpwrap and pipewrap
This comment exist on pipewrap OnConnect:
// TODO(bnoordhuis) maybe share with TCPWrap?
void PipeWrap::OnConnection(uv_stream_t* handle, int status) {
PipeWrap* pipe_wrap = static_cast<PipeWrap*>(handle->data);
CHECK_EQ(&pipe_wrap->handle_, reinterpret_cast<uv_pipe_t*>(handle));
The reinterpret_cast operator changes one data type into another. Recall how the types of libuv have a type of c inheritance allowing casting.
/*
* uv_pipe_t is a subclass of uv_stream_t.
*
* Representing a pipe stream or pipe server. On Windows this is a Named
* Pipe. On Unix this is a Unix domain socket.
*/
struct uv_pipe_s {
UV_HANDLE_FIELDS
UV_STREAM_FIELDS
int ipc; /* non-zero if this pipe is used for passing handles */
UV_PIPE_PRIVATE_FIELDS
};
The main difference that I've been able to find is in pipewrap status
is checked:
if (status != 0) {
pipe_wrap->MakeCallback(env->onconnection_string(), arraysize(argv), argv);
return;
}
src/stream_wrap.cc
Looking into a task where the public member field req_ in src/req_wrap.cc is to be made private, I came accross the following method:
286 void StreamWrap::AfterShutdown(uv_shutdown_t* req, int status) {
287 ShutdownWrap* req_wrap = ContainerOf(&ShutdownWrap::req_, req);
288 HandleScope scope(req_wrap->env()->isolate());
289 Context::Scope context_scope(req_wrap->env()->context());
290 req_wrap->Done(status);
291 }
What I did for the public req_ member is made it private and then added a public accessor method for it. This was easy to update in most places but in src/stream_wrap.cc we have the following line:
ShutdownWrap* req_wrap = ContainerOf(&ShutdownWrap::req_, req);
Extracting AfterConnect into connection_wrap.cc
Just like OnConnect
was extracted into connection_wrap and shared by both tcp_wrap and pipe_wrap the same should be done for AfterConnect
.
The main difference I found was in PipeWrap::AfterConnect
:
bool readable, writable;
if (status) {
readable = writable = 0;
} else {
readable = uv_is_readable(req->handle) != 0;
writable = uv_is_writable(req->handle) != 0;
}
Local<Object> req_wrap_obj = req_wrap->object();
Local<Value> argv[5] = {
Integer::New(env->isolate(), status),
wrap->object(),
req_wrap_obj,
Boolean::New(env->isolate(), readable),
Boolean::New(env->isolate(), writable)
};
AfterConnect is a callback that is passed to uv_pipe_connect. The status will be 0 if uv_connect() was successful and < 0 otherwise.
The thing to notice is the difference compared to tcp_wrap:
Local<Object> req_wrap_obj = req_wrap->object();
Local<Value> argv[5] = {
Integer::New(env->isolate(), status),
wrap->object(),
req_wrap_obj,
v8::True(env->isolate()),
v8::True(env->isolate())
};
TCPWrap always sets the readable and writable values to true where as PipeWrap checks if the handle is readble/writeble. Seems like a the TCPWrap will always be both readable and writable.
Making ReqWrap req_ member private
Currently the member req_ is public in src/req
One issue when doing this was that after renaming req_ to req() I had to rename a macro in src/node_file.cc to avoid a collision with the macro parameter with the same name.
The second issue I ran into was with src/stream_wrap.cc:
void StreamWrap::AfterShutdown(uv_shutdown_t* req, int status) {
ShutdownWrap* req_wrap = ContainerOf(&ShutdownWrap::req_, req);
We can find ContainerOf
in src/util-inl.h :
template <typename Inner, typename Outer>
inline ContainerOfHelper<Inner, Outer> ContainerOf(Inner Outer::*field, Inner* pointer) {
return ContainerOfHelper<Inner, Outer>(field, pointer);
}
The call in question is auto-deducing the paremeter types from the arguments, it could also have been explicit:
ShutdownWrap* req_wrap = ContainerOf<uv_shutdown_t*, ShutdownWrap>(&ShutdownWrap::req_, req);
ContainerOfHelper
src/util.h
declares a class named ContainerOfHelper:
// The helper is for doing safe downcasts from base types to derived types.
template <typename Inner, typename Outer>
class ContainerOfHelper {
public:
inline ContainerOfHelper(Inner Outer::*field, Inner* pointer);
template <typename TypeName>
inline operator TypeName*() const;
private:
Outer* const pointer_;
};
So back to our call using ContainerOf which will invoke:
template <typename Inner, typename Outer>
ContainerOfHelper<Inner, Outer>::ContainerOfHelper(Inner Outer::*field, Inner* pointer)
: pointer_(reinterpret_cast<Outer*>(reinterpret_cast<uintptr_t>(pointer) - reinterpret_cast<uintptr_t>(&(static_cast<Outer*>(0)->*field)))) {
}
First, note that the parameter field
is a pointer-to-member, which gives the offset of the member within the class object as opposed to using the
address-of operator on a data member bound to an actual class object which yields the member's actual address in memory.
uintptr_t
is an unsigned int that is capable of storing a pointer. Such a type can be used when you need to perform integer operations on a pointer.
reinterpret_cast is a compiler directive which instructs the compiler to treat the sequence
of bits as if it had the new type:
reinterpret_cast<uintptr_t>(pointer)
reinterpret_cast is used to convert any pointer type to any other pointer type and the result is a binary copy of the value.
reinterpret_cast<uintptr_t>(&(static_cast<ShutdownWrap*>(0)->*field))
I've not seen this usage before using 0 as the argument to static_cast:
static_cast<ShutdownWrap*>(0)->*field)
The static_cast part of this expression will give a nullptr, but we are not accessing a member, but a pointer-to-member which remember is the offset.
A pointer is only a memory address but the type of the object determines how a pointer can be used, like using a member it needs to know the offsets
of those members.
So we creating a pointer to Outer which by using the offset of the field and substracting that from pointer
. So when using a pointer and dereferencing
field
this will point to same value of pointer
Why does the protected field req_ have to be last:
Command: out/Release/node /Users/danielbevenius/work/nodejs/node/test/parallel/test-child-process-stdio-big-write-end.js
--- CRASHED (Signal: 10) ---
=== release test-cluster-disconnect ===
Path: parallel/test-cluster-disconnect
/Users/danielbevenius/work/nodejs/node/out/Release/node[84341]: ../src/connection_wrap.cc:83:static void node::ConnectionWrap<node::TCPWrap, uv_tcp_s>::AfterConnect(uv_connect_t *, int) [WrapType = node::TCPWrap, UVType = uv_tcp_s]: Assertion `(req_wrap->env()) == (wrap->env())' failed.
1: node::Abort() [/Users/danielbevenius/work/nodejs/node/out/Release/node]
2: node::RunMicrotasks(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/danielbevenius/work/nodejs/node/out/Release/node]
3: node::ConnectionWrap<node::TCPWrap, uv_tcp_s>::AfterConnect(uv_connect_s*, int) [/Users/danielbevenius/work/nodejs/node/out/Release/node]
4: uv__stream_io [/Users/danielbevenius/work/nodejs/node/out/Release/node]
5: uv__io_poll [/Users/danielbevenius/work/nodejs/node/out/Release/node]
6: uv_run [/Users/danielbevenius/work/nodejs/node/out/Release/node]
7: node::Start(int, char**) [/Users/danielbevenius/work/nodejs/node/out/Release/node]
8: start [/Users/danielbevenius/work/nodejs/node/out/Release/node]
9: 0x2
"req_wrap_queue_ needs to be at a fixed offset from the start of the struct because it is used by ContainerOf to calculate the address of the embedding ReqWrap. ContainerOf compiles down to simple, fixed pointer arithmetic. sizeof(req_) depends on the type of T, so req_wrap_queue_ would no longer be at a fixed offset if it came after req_."
This is what ReqWrap currently looks like:
private:
friend class Environment;
ListNode<ReqWrap> req_wrap_queue_;
Notice that this is not a pointer and when a ReqWrap instance is created the ListNode::ListNode() constructor will be called:
template <typename T>
ListNode<T>::ListNode() : prev_(this), next_(this) {}
So every instance will have it's own doubly link linked list and each entry contains a ReqWrap instance which has a type T member. Depending on the type of T the size of the ReqWrap object in memory will be different. So it would not be possible to have req_wrap_queue after req_, or req_ before req_wrap_queue as this would make the offset different during runtime (compile time would still work fine).
Every Environment instance has the following queues:
HandleWrapQueue handle_wrap_queue_;
ReqWrapQueue req_wrap_queue_;
And a typedef for this is created using a pointer-to-member:
typedef ListHead<ReqWrap<uv_req_t>, &ReqWrap<uv_req_t>::req_wrap_queue_> ReqWrapQueue;
Each time a instance of ReqWrap is created that instance will be added to the queue:
env->req_wrap_queue()->PushBack(reinterpret_cast<ReqWrap<uv_req_t>*>(this));
Share AfterWrite with with udp_wrap and stream_wrap
So, the task is basically to follow this comment in udb_wrap.cc:
// TODO(bnoordhuis) share with StreamWrap::AfterWrite() in stream_wrap.cc
void UDPWrap::OnSend(uv_udp_send_t* req, int status) {
At first glance this don't look that similar that they could be shared:
void UDPWrap::OnSend(uv_udp_send_t* req, int status) {
SendWrap* req_wrap = static_cast<SendWrap*>(req->data);
if (req_wrap->have_callback()) {
Environment* env = req_wrap->env();
HandleScope handle_scope(env->isolate());
Context::Scope context_scope(env->context());
Local<Value> arg[] = {
Integer::New(env->isolate(), status),
Integer::New(env->isolate(), req_wrap->msg_size),
};
req_wrap->MakeCallback(env->oncomplete_string(), 2, arg);
}
delete req_wrap;
}
have_callback()
is a method on the SendWrap class and does not exist for WriteWrap.
void StreamWrap::AfterWrite(uv_write_t* req, int status) {
WriteWrap* req_wrap = WriteWrap::from_req(req);
CHECK_NE(req_wrap, nullptr);
HandleScope scope(req_wrap->env()->isolate());
Context::Scope context_scope(req_wrap->env()->context());
req_wrap->Done(status);
}
First thing to notice is the checking for a callback, StreamWrap::AfterWrite seems to assume that there will
always be a callback by looking at req_wrap->Done
:
inline void Done(int status, const char* error_str = nullptr) {
Req* req = static_cast<Req*>(this);
Environment* env = req->env();
if (error_str != nullptr) {
req->object()->Set(env->error_string(), OneByteString(env->isolate(), error_str));
}
cb_(req, status);
}
When DoShutdown
is called the last thing that is done is:
req_wrap->Dispatched();
which will set req_.data = this; this being the Shutdown wrap instance. Later when the AfterShutdown
method is called that instance will be available
by using the req->data.
Stream class hierarchy
class TTYWrap : public StreamWrap
class PipeWrap : public ConnectionWrap<PipeWrap, uv_pipe_t>
class TCPWrap : public ConnectionWrap<TCPWrap, uv_tcp_t>
class ConnectionWrap : public StreamWrap
class StreamWrap : public HandleWrap, public StreamBase
class HandleWrap : public AsyncWrap
class AsyncWrap : public BaseObject
class BaseObject
class StreamBase : public StreamResource
class StreamResource
Wrapped
var TCP = process.binding('tcp_wrap').TCP;
var TCPConnectWrap = process.binding('tcp_wrap').TCPConnectWrap;
var ShutdownWrap = process.binding('stream_wrap').ShutdownWrap;
var client = new TCP();
var shutdownReq = new ShutdownWrap();
This above will invoke the constructor set up by TCPWrap::Initialize
:
auto constructor = [](const FunctionCallbackInfo<Value>& args) {
CHECK(args.IsConstructCall());
};
auto cwt = FunctionTemplate::New(env->isolate(), constructor);
cwt->InstanceTemplate()->SetInternalFieldCount(1);
SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"));
Set(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"), GetFunction());
The only thing the constructor does is check that new
is used with the function (as in new ShutdownWrap).
var err = client.shutdown(shutdownReq);
The methods available to a TCP instance are also configured in TCPWrap::Initialize
. The shutdown
method is set up using the
following call:
StreamWrap::AddMethods(env, t, StreamBase::kFlagHasWritev);
src/stream_base-inl.h
contains the shutdown method:
env->SetProtoMethod(t, "shutdown", JSMethod<Base, &StreamBase::Shutdown>);
So we are using a referece to StreamBase::Shutdown which can be found in src/stream_base.cc:
int StreamBase::Shutdown(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CHECK(args[0]->IsObject());
Local<Object> req_wrap_obj = args[0].As<Object>();
ShutdownWrap* req_wrap = new ShutdownWrap(env,
req_wrap_obj,
this,
AfterShutdown);
The Shutdown constructor delegates to ReqWrap:
ReqWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_SHUTDOWNWRAP),
Which delegates to AsyncWrap:
AsyncWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_SHUTDOWNWRAP),
Which delegates to BaseObject:
BaseObject(env, req_wrap_obj)
req_wrap_obj is refered to handle in BaseObject and is made into a persistent V8 handle
AfterShutdown
is of type typedef void (DoneCb)(Req req, int status). This callback is passed to the constructor of
StreamReq:
StreamReq<ShutdownWrap>(cb)
This will simply store the callback in a private field.
The StreamBase
instance (this
in the call above) will be set as a private member of Shutdown wrap.
There is a single function call in the constructor which is:
Wrap(req_wrap_obj, this);
void Wrap(v8::Local<v8::Object> object, TypeName* pointer) {
CHECK_EQ(false, object.IsEmpty());
CHECK_GT(object->InternalFieldCount(), 0);
object->SetAlignedPointerInInternalField(0, pointer);
}
So we are setting the ShutdownWrap instance pointer on the V8 local object. So wrap means that we are wrapping the ShutdownWrap instance in the req_warp_obj.
Compiling the test in this project
First step is that Google Test needs to be added. Follow the steps in "Adding Google test to the project" before proceeding.
Building and running the tests
make check
Clean
make clean
Adding Google test to the project
Build the gtest lib:
$ mkdir lib
$ mkdir deps ; cd deps
$ git clone git@github.com:google/googletest.git
$ cd googletest/googletest
$ mkdir build ; cd build
$ c++ -std=gnu++0x -stdlib=libstdc++ -I`pwd`/../include -I`pwd`/../ -pthread -c `pwd`/../src/gtest-all.cc
$ ar -rv libgtest.a gtest-all.o
$ cp libgtest.a ../../../../lib
We will be linking against Node.js which is build (on mac) using c++ and using the GNU Standard library.
Before OS X 10.9.x the default was libstdc++, but after OS X 10.9.x the default is libc++. I'm ususing 10.11.5 so the default would be libc++ in my case. I ran into an issue when compiling and not explicitely specifying -stdlib=libstdc++
as this would mix two different standard library implementations.
Instead of our program crashing at runtime we get a link time error. libc++ uses a C++11 language feature called inline namespace to change the ABI of std::string without impacting the API of std::string. That is, to you std::string looks the same. But to the linker, std::string is being mangled as if it is in namespace std::__1. Thus the linker knows that std::basic_string and std::__1::basic_string are two different data structures (the former coming from gcc's libstdc++ and the latter coming from libc++).
Writing a test file
$ mkdir test
$ vi main.cc
#include "gtest/gtest.h"
#include "base-object_test.cc"
int main(int argc, char* argv[]) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
$ vi base-object_test.cc
#include "gtest/gtest.h"
TEST(BaseObject, base) {
}
then compile using:
$ clang++ -I`pwd`/../deps/googletest/googletest/include -pthread main.cc ../lib/libgtest.a -o base-object_test
Run the test:
./base-object_test
use of undeclared identifier 'node'
After making sure that I can include the 'base-object.h' header I get the following error when compiling:
In file included from test/main.cc:2:
test/base-object_test.cc:9:3: error: use of undeclared identifier 'node'
node::BaseObject bo;
^
1 error generated.
make: *** [test/base-object_test] Error 1
After taking a closer look at src/base-object.j
I noticed this line:
#if defined(NODE_WANT_INTERNALS) && NODE_WANT_INTERNALS
I've not set this in the test, so there is not much being included by the preprocessor.
Adding #define NODE_WANT_INTERNALS 1
should fix this.
Default implicit destructor
When working on a task involving extracting commmon code to a superclass I caused an issue with the CI builds.
What I had originally done was added an empty destructor:
~ConnectionWrap() {
}
I later changed this to be an explicitly defaulted destructor generated by the compiler
~ConnectionWrap() = default;
While I did not see any failures on my local machine during development the CI server did. I currently don't have more information than this but will try to gather some.
My understanding/assumption was that these two would be equivalent. So what is doing on? Let's start by taking a look a the inheritance tree and the various destructors:
class ConnectionWrap : public StreamWrap
protected:
~ConnectionWrap() {}
class StreamWrap : public HandleWrap, public StreamBase
protected:
~StreamWrap() { }
class HandleWrap : public AsyncWrap
protected:
~HandleWrap() override;
class AsyncWrap : public BaseObject
public:
inline virtual ~AsyncWrap();
class BaseObject
public:
inline virtual ~BaseObject();
class StreamBase : public StreamResource
public:
virtual ~StreamBase() = default;
class StreamResource
public:
virtual ~StreamResource() = default;
From looking at the error:
In file included from ../src/pipe_wrap.h:7:0,
from ../src/pipe_wrap.cc:1:
../src/connection_wrap.h:26:3: internal compiler error: in use_thunk, at cp/method.c:338
~ConnectionWrap() = default;
^
Please submit a full bug report,
with preprocessed source if appropriate. See file:///usr/share/doc/gcc-4.8/README.Bugs for instructions. Preprocessed source stored into /tmp/ccvbqQz3.out file, please attach this to your bugreport. ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/_usr_lib_gcc_x86_64-linux-gnu_4.8_cc1plus.1000.crash' make[2]: *** [/home/iojs/build/workspace/node-test-commit-linux/nodes/ubuntu1204-64/out/Release/obj.target/node/src/pipe_wrap.o] Error 1
it looks like GCC (G++) 4.8 is being used. The reason for asking is I did a search and found a few indications that this might be a bug in the compiler. This is reported as sovled in 4.8.3 which is also why I'm curious about the compiler version. The centos machines use devtoolset-2, which comes with g++ 4.8.2.
If a class has no user-declared destructor, one is declared implicitly by the compiler and is called an implicitly-declared destructor. An implicitly-declared destructor is inline. Another aspect about destructors that is important to understand is that even if the body of a destructor is empty, it doesn’t mean that this destructor won’t execute any code. The C++ compiler augments the destructor with calls to destructors for bases and non-static data members
Chrome debugger
Open developer tools from Chrome CMD+OPT+I
Debugging
CMD+;
step into
CMD+'
step over
CMD+SHIFT+;
step out
CMD+\
continue
CTRL+.
next call frame
CTRL+,
previous call frame
CMD+B
toggle breakpoint
CTRL+SHIFT+E
run highlighted snipped and show output in console.
Searching
CMD+F
search current file
CMD+ALT+F
search all sources
CMD+P
go to source file. Opens a dialog where you can type in a file name
CTRL+G
go to line
Editor
SHIFT+CMD+P
go to member
ESC
toggle drawer
CTRL+~
jump to console
CMD+[
next panel
CMD+]
previous panel
CMD+ALT+[
next panel in history
CMD+ALT+]
previous panel history
CMD+SHIFT+D
toggle location of panels (separate screen/docked)
?
show settings dialog
You can see all the shortcuts from here
ESC
close settings/dialog
Node Package Manager (NPM)
I was curious about what type of program it is. Looking at the shell script on my machine it is a simple wrapper that calls node with the javascript file being the shell script itself. Kinda like doing:
#!/bin/sh
// 2>/dev/null; exec "`dirname "~/.nvm/versions/node/v4.4.3/bin/npm"`/node" "$0" "$@"
console.log("bajja");
Make
GNU make has two phases. During the first phase it reads all the makefiles, and internalizes all variables. Make will expand any variables or functions in that section as the makefile is parsed. This is called immediate expansion since this happens during the first phase. The expansion is called deferred if it is not performed immediately.
Take a look at this rule:
config.gypi: configure
if [ -f $@ ]; then
$(error Stale $@, please re-run ./configure)
else
$(error No $@, please run ./configure first)
fi
The recipe in this case is a shell if statement, which is a deferred construct. But the control function $(error)
is an immediate construct which will cause the makefile processing to stop processing.
If I understand this correctly the only possible outcome of this rule is the Stale config.gypi message which will be done in the first phase and then exit. The shell condition will not be considered.
For example, if we delete config.gypi
we would expect the result to be an error saying that No config.gypi, please run ./configure first
. But the result is:
Makefile:81: *** Stale config.gypi, please re-run ./configure. Stop.
Keep in mind that config.gypi
is not a .PHONY target, so it is a file on the file system and if it is missing the recipe will be run.
So we could use a simple echo statement and and exit to work around this:
config.gypi: configure
@if [ -f $@ ]; then \
echo Stale $@, please re-run ./$<; \
else \
echo No $@, please run ./$< first; \
fi
@exit 1;
But that will produce the kind of ugly result:
$ make config.gypi
Stale config.gypi, please re-run ./configure
make: *** [config.gypi] Error 1
AtExit
// TODO(bnoordhuis) Turn into per-context event.
4278 void RunAtExit(Environment* env) {
What exactly is a AtExit function. An "AtExit" hook is a function that is invoked after the Node.js event loop has ended but before the JavaScript VM is terminated and Node.js shuts down.
So in node.cc you can find:
void AtExit(void (*cb)(void* arg), void* arg) {
This would be called like this:
static void callback(void* arg) {
}
AtExit(callback);
static AtExitCallback* at_exit_functions_;
I notices that AtExist is declared in node.h:
NODE_EXTERN void RunAtExit(Environment* env);
NODE_EXTERN
is declared as:
So the idea is that at_exit_functions_ should be a per-environment property rather than a global. Like bnoordhuis pointed out, AtExit does not take a pointer to an Environment but we have to add the callbacks to the Environment associated with the addon. Is the environemnt available when the addons init function is called?
To answer that question, what is the type contained in the init function of an addon?
void init(Local<Object> target) {
AtExit(at_exit_cb1, target->CreationContext()->GetIsolate());
}
NODE_MODULE(binding, init);
So a user will still have to call AtExit but instead of node.cc holding a static linked list of callbacks to call these should be added to the current environment.
void AtExit(void (*cb)(void* arg), void* arg) {
So AtExit takes a function pointer as its first argument, and a void pointer as its second. The function pointer is to a function that returns void and takes a void pointer as an argument.
mp->nm_register_func(exports, module, mp->nm_priv);
The above call can be found in DLOpen
in src/node.cc`. The first thing that happens in DLOpen is:
Environment* env = Environment::GetCurrent(args);
I've covered the setting of the Environment in AssignToContext
previously. This is done by the Environment contructor and by
node_contextify.cc.
The only Start
function exposed in node.h is the one that takes argc
and argv
. Calling node::Start multiple times does
not work and result in the following error:
# Fatal error in ../deps/v8/src/isolate.cc, line 2021
# Check failed: thread_data_table_.
#
==== C stack trace ===============================
0 cctest 0x0000000100324fce v8::base::debug::StackTrace::StackTrace() + 30
1 cctest 0x0000000100325005 v8::base::debug::StackTrace::StackTrace() + 21
2 cctest 0x000000010031dd94 V8_Fatal + 452
3 cctest 0x0000000100cc053c v8::internal::Isolate::Isolate(bool) + 2092
4 cctest 0x0000000100cc0ad5 v8::internal::Isolate::Isolate(bool) + 37
5 cctest 0x0000000100370a59 v8::Isolate::New(v8::Isolate::CreateParams const&) + 41
6 cctest 0x000000010003323f node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 79
7 cctest 0x0000000100032e38 node::Start(int, char**) + 200
8 cctest 0x00000001000cdb33 EnvironmentTest_StartMultipleTimes_Test::TestBody() + 51
9 cctest 0x000000010014089a void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 122
10 cctest 0x00000001001190be void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 110
11 cctest 0x0000000100118fa5 testing::Test::Run() + 197
12 cctest 0x0000000100119f98 testing::TestInfo::Run() + 216
13 cctest 0x000000010011b227 testing::TestCase::Run() + 231
14 cctest 0x0000000100129ccc testing::internal::UnitTestImpl::RunAllTests() + 908
15 cctest 0x00000001001444aa bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 122
16 cctest 0x00000001001298be bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 110
17 cctest 0x00000001001297b5 testing::UnitTest::Run() + 373
18 cctest 0x0000000100147a81 RUN_ALL_TESTS() + 17
19 cctest 0x0000000100147a5b main + 43
20 cctest 0x00000001000010f4 start + 52
make: *** [cctest] Illegal instruction: 4
The Environment created when using the above start function is done in
inline int Start(Isolate* isolate, IsolateData* isolate_data,
int argc, const char* const* argv,
int exec_argc, const char* const* exec_argv) {
Would it be safe to use GetCurrent using the isolate in
Thread-local
static thread_local Environment* thread_local_env;
The object is allocated when the thread begins and deallocated when the thread ends. Each thread has its own instance of the object. Only objects declared thread_local have this storage duration. thread_local can appear together with static or extern to adjust linkage.
So we are specifying static only to specify that it should only have internal linkage, meaning that it can be referred to from all scopes in the current translation unit. It does not mean that it is static as in "static storage" meaning that it would be allocated when the program begins and deallocated when the program ends. But without the static linkage it would be external by default which is not what we want.
When used in a declaration of an object, it specifies static storage duration (except if accompanied by thread_local). When used in a declaration at namespace scope, it specifies internal linkage.
Using a `while(more == true)' :
0x1011d81a9 <+1177>: jmp 0x1011d81ae ; <+1182> at node.cc:4453
0x1011d81ae <+1182>: movb -0xd31(%rbp), %al ; move byte value of -0xd31(%rpb) (move variable) into al register
0x1011d81b4 <+1188>: andb $0x1, %al ; AND 1 and the content of move variable
0x1011d81b6 <+1190>: movzbl %al, %ecx ; conditional move into eax if zero
0x1011d81b9 <+1193>: cmpl $0x1, %ecx ; compare 1 and the contents of eax
0x1011d81bc <+1196>: je 0x1011d80e9 ; <+985> at node.cc:4437
0x1011d81c2 <+1202>: leaq -0xd30(%rbp), %rdi
0x1011d81c9 <+1209>: callq 0x1002214e0 ; v8::SealHandleScope::~SealHandleScope at api.cc:926
Compared to using `while(more)`:
0x1011d81a9 <+1177>: jmp 0x1011d81ae ; <+1182> at node.cc:4453
0x1011d81ae <+1182>: testb $0x1, -0xd31(%rbp) ; AND 1 and more
0x1011d81b5 <+1189>: jne 0x1011d80e9 ; <+985> at node.cc:4437
0x1011d81bb <+1195>: leaq -0xd30(%rbp), %rdi
0x1011d81c2 <+1202>: callq 0x1002214e0 ; v8::SealHandleScope::~SealHandleScope at api.cc:926
Calling conventions
Are the rules when making functions calls regarding how parameters are passed, who is responsible for cleaning up the stack, how the return value is to be retrieved, and also how the function calls are decorated.
cdecl
A calling convention that is used for standard C where the the stack must be cleaned up by the callee as there is support for varargs and there is now way for the called function to know the actual number of values pushed onto the stack before the function was called. Function name is decorated by prefixing it with an underscore character '_' .
stdcall
Here arguments are fixed and the called function can to the stack clean up. The advantage here is that the stack clean up code is only done once in one place. Function name is decorated by prepending an underscore character and appending a '@' character and the number of bytes of stack space required.
Issue
When running the Node.js build on windows (trying to get cctest to work for a test I added), I got the following link error:
env.obj : error LNK2001: unresolved external symbol
"public: __cdecl node::Utf8Value::Utf8Value(class v8::Isolate *,class v8::Local<class v8::Value>)" (??0Utf8Value@node@@QEAA@PEAVIsolate@v8@@V?$Local@VValue@v8@@@3@@Z) [c:\workspace\node-compile-windows\label\win-vs2015\cctest.vcxproj]
Now, we can see that the calling convention used is __cdecl
but the name mangling does not look correct as it is using @@
process_title.len = argv[argc - 1] + strlen(argv[argc - 1]) - argv[0];
This would be the same as :
(lldb) p (size_t) argv[argc-1] + (size_t) strlen(argv[argc-1]) - (size_t)argv[0]
(unsigned long) $9 = 56
When in my unit test the same gives me:
(lldb) p (size_t) argv[argc-1] + (size_t) strlen(argv[argc-1]) - (size_t)argv[0]
(unsigned long) $10 = 34693
What is happening is that we are taking the memory address of argv[argc-1] +
Debugging a Node addon
The task a hand was to debug Realm's addon to see why test were just hanging even though I made sure to call Tape test's end function. So, realm is a normal dependency and exist in node_modules.
Setup:
$ npm install --save realm
$ cd node_modules/realm
$ env REALMJS_USE_DEBUG_CORE=true node-pre-gyp install --build-from-source --debug
$ lldb -- node test/datastores/realm-store-test.js
(lldb) breakpoint set --file node_init.cpp --line 26
It turns out that when breaking in the debugger (CTRL+C) and then stepping through it was in a kevent and this migth be some kind of listerner for events, and there is a realm.removeAllListeners() that can be called and this solved my issue.
Generate Your Project
For node the various targets in node.gyp will generate make files in the out
directory.
For example the target named cctest
will generate out/cctest.target.mk file.
Profiling
You can use Google V8's built in profiler using the --prof
command line option:
$ out/Debug/node --prof test.js
This will generate a file in the current directory named something like isolate-0x104005e00-v8.log
.
Now we can process this file:
$ export D8_PATH=~/work/google/javascript/v8/out/x64.debug
$ deps/v8/tools/mac-tick-processor isolate-0x104005e00-v8.log
or you can use node's ---prof-process
option:
$ ./out/Debug/node --prof-process isolate-0x104005e00-v8.log
Statistical profiling result from isolate-0x104005e00-v8.log, (332 ticks, 86 unaccounted, 0 excluded).
The profiler is sample based so with wakes up and takes a sample. The intervals that is wakes up is called a tick. It will look at where the instruction pointer is RIP and reports the function if that function can be resolved. If cannot resolve the function this will be reported as an unaccounted tick.
[Summary]:
ticks total nonlib name
0 0.0% 0.0% JavaScript
236 71.1% 73.3% C++
4 1.2% 1.2% GC
10 3.0% Shared libraries
86 25.9% Unaccounted
We can see that 71.1%
of the time was spent in C++ code. Inspecting the C++ section you should be able to see were the most
time is being spent and the sources.
[C++]:
ticks total nonlib name
66 19.9% 20.5% node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
20 6.0% 6.2% node::Binding(v8::FunctionCallbackInfo<v8::Value> const&)
5 1.5% 1.6% v8::internal::HandleScope::ZapRange(v8::internal::Object**, v8::internal::Object**)
The [Bottom up]
section shows us which the primary callers of the above are:
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
86 25.9% UNKNOWN
66 19.9% node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
66 100.0% v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
66 100.0% LazyCompile: ~runInThisContext bootstrap_node.js:427:28
66 100.0% LazyCompile: ~NativeModule.compile bootstrap_node.js:509:44
66 100.0% LazyCompile: ~NativeModule.require bootstrap_node.js:443:34
15 22.7% LazyCompile: ~startup bootstrap_node.js:12:19
11 16.7% Function: ~<anonymous> module.js:1:11
8 12.1% Function: ~<anonymous> stream.js:1:11
7 10.6% LazyCompile: ~setupGlobalVariables bootstrap_node.js:192:32
6 9.1% Function: ~<anonymous> util.js:1:11
6 9.1% Function: ~<anonymous> tty.js:1:11
3 4.5% LazyCompile: ~setupGlobalTimeouts bootstrap_node.js:226:31
2 3.0% LazyCompile: ~createWritableStdioStream internal/process/stdio.js:134:35
2 3.0% Function: ~<anonymous> fs.js:1:11
2 3.0% Function: ~<anonymous> buffer.js:1:11
LazyCompile: Simply means that the function was complied lazily and not that this was the time spent compiling.
- before function name means that time is being spent in optimized function. ~ before a function means that is was not optimized.
The % in the parent column shows the percentage of samples for which the function in the row above was called by the function in the current row. So,
66 19.9% node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
66 100.0% v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
would be read as when v8::internal::Builting_HandleApiCall
was sampled it called node:ContextifyScript every time.
And
66 100.0% LazyCompile: ~NativeModule.require bootstrap_node.js:443:34
15 22.7% LazyCompile: ~startup bootstrap_node.js:12:19
that when startup in bootstrap_node.js was called, in 22% of the samples it called NativeModule.require.
[Shared libraries]:
ticks total nonlib name
6 1.8% /usr/lib/system/libsystem_kernel.dylib
2 0.6% /usr/lib/system/libsystem_platform.dylib
1 0.3% /usr/lib/system/libsystem_malloc.dylib
1 0.3% /usr/lib/system/libsystem_c.dylib
setTimeout
Let's take the following example:
setTimeout(function () {
console.log('bajja');
}, 5000);
$ ./out/Debug/node --inspect --inspect-brk settimeout.js
In Node you can call setTimeout with out having a require. This is done by lib/boostrap/node.js
:
function setupGlobalTimeouts() {
const timers = NativeModule.require('timers');
global.clearImmediate = timers.clearImmediate;
global.clearInterval = timers.clearInterval;
global.clearTimeout = timers.clearTimeout;
global.setImmediate = timers.setImmediate;
global.setInterval = timers.setInterval;
global.setTimeout = timers.setTimeout;
}
So we can see that we are able to call setTimout without having to require any module and that it is part of a native modules named timers. This is located in lib/timers.js.
The first thing that will happen is a new Timeout will be created in createSingleTimeout
. A timeout looks like:
function Timeout(after, callback, args) {
this._called = false;
this._idleTimeout = after; // this will be 5000 in our use-case
this._idlePrev = this;
this._idleNext = this;
this._idleStart = null;
this._onTimeout = callback; // this is our callback that just logs to the console
this._timerArgs = args;
this._repeat = null;
}
This timer
instance is then passed to active(timer)
which will insert the timer by calling insert
:
insert(item, false);
item
is the timer, and false is the value of the unrefed argument)
item._idleStart = TimerWrap.now();
So we can see that we are using timer_wrap which is located in src/timer_wrap.cc and the now function which is
initialized to:
```c++
env->SetTemplateMethod(constructor, "now", Now);
Back in the insert function we then have the following:
const lists = unrefed === true ? unrefedLists : refedLists;
We know that unrefed is false so lists will be the refedLists which is an object keyed with the millisecond that a timeout is due to expire. The value of each key is a linkedlist of timers that expire at the same time.
var list = lists[msecs];
If there are other timers that also expire after 5000ms then there might already be a list for them. But in this case there is not and a new list will be created:
lists[msecs] = list = createTimersList(msecs, unrefed);
const list = new TimersList(msecs, unrefed); // 5000 and false
function TimersList(msecs, unrefed) {
this._idleNext = null; // Create the list with the linkedlist properties to
this._idlePrev = null; // prevent any unnecessary hidden class changes.
this._timer = new TimerWrap();
this._unrefed = unrefed; // will be false in our case
this.msecs = msecs; // will be 5000 in our case
}
The new TimerWrap
call will invoke New
in timer_wrap.cc as setup in the initialize function:
Local<FunctionTemplate> constructor = env->NewFunctionTemplate(New);
New
will invoke TimerWrap's constructor which does:
int r = uv_timer_init(env->event_loop(), &handle_);
So we can see that it is setting up a libuv timer. Shortly after we have the following code (back in JavaScript land and lib/timers.js):
Next the list (TimerList) is initialized setting _idleNext and _idlePrev to list. After this we are adding a field to the list:
list._timer._list = list;
list._timer.start(msecs);
Start is initialized using :
env->SetProtoMethod(constructor, "start", Start);
static void Start(const FunctionCallbackInfo<Value>& args) {
TimerWrap* wrap = Unwrap<TimerWrap>(args.Holder());
CHECK(HandleWrap::IsAlive(wrap));
int64_t timeout = args[0]->IntegerValue();
int err = uv_timer_start(&wrap->handle_, OnTimeout, timeout, 0);
args.GetReturnValue().Set(err);
}
Compare this with timer.c. and you can see that these is not that much of a difference. Let's look at the callback OnTimeout:
static void OnTimeout(uv_timer_t* handle) {
TimerWrap* wrap = static_cast<TimerWrap*>(handle->data);
Environment* env = wrap->env();
HandleScope handle_scope(env->isolate());
Context::Scope context_scope(env->context());
wrap->MakeCallback(kOnTimeout, 0, nullptr);
}
The callback in question looks like:
0xa90abea6961: [Function]
- map = 0x2fecee786da1 [FastProperties]
- prototype = 0x1b3829484539
- elements = 0x2203fd302241 <FixedArray[0]> [FAST_HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0x28b2bad27aa1 <SharedFunctionInfo listOnTimeout>
- name = 0x28b2bad26b31 <String[13]: listOnTimeout>
- formal_parameter_count = 0
- context = 0x19bfe9663951 <FixedArray[48]>
- feedback vector cell = 0x28b2bad2a549 <Cell value= 0x2203fd302311 <undefined>>
- code = 0x26d0e2004941 <Code BUILTIN>
- properties = 0x2203fd302241 <FixedArray[0]> {
#length: 0x35bd747eed51 <AccessorInfo> (const accessor descriptor)
#name: 0x35bd747eedc1 <AccessorInfo> (const accessor descriptor)
#prototype: 0x35bd747eee31 <AccessorInfo> (const accessor descriptor)
}
Notice that the callback is listOnTimeout
and this can be found in lib/timer.js
.
setImmediate
The very simple JavaScript looks like this:
setImmediate(function () {
console.log('bajja');
});
Like setTimeout the implementation is found in lib/timers.js.
A new Immediate will be created in createImmediate
which looks like this:
function Immediate() {
// assigning the callback here can cause optimize/deoptimize thrashing
// so have caller annotate the object (node v6.0.0, v8 5.0.71.35)
this._idleNext = null;
this._idlePrev = null;
this._callback = null;
this._argv = null;
this._onImmediate = null;
this.domain = process.domain;
}
The following check will then be done:
if (!process._needImmediateCallback) {
process._needImmediateCallback = true;
process._immediateCallback = processImmediate;
}
In this case process._needImmediateCallback
is false so we'll enter the above block and set process._needImmediateCallback
to true
.
Also, notice that we are setting the processImmediate instance as a member of the process object.
processImmediate
is a function defined in timer.js. There is a V8 accessor for the field _immediateCallback
on the process object which is set up in node.cc (SetupProcessObject function):
auto need_immediate_callback_string =
FIXED_ONE_BYTE_STRING(env->isolate(), "_needImmediateCallback");
CHECK(process->SetAccessor(env->context(), need_immediate_callback_string,
NeedImmediateCallbackGetter,
NeedImmediateCallbackSetter,
env->as_external()).FromJust());
So when we do process_.immediateCallback
NeedImmediateCallbackSetter
will be invoked.
Looking closer at this function and comparing it with a libuv check example we should see some similarties.
uv_check_t* immediate_check_handle = env->immediate_check_handle();
uv_idle_t* immediate_idle_handle = env->immediate_idle_handle();
uv_check_start(immediate_check_handle, CheckImmediate);
// Idle handle is needed only to stop the event loop from blocking in poll.
uv_idle_start(immediate_idle_handle, IdleImmediateDummy);
So we can see that when this setter is called it will set up check handle (if the value
was true as in process._needImmediateCallback = true
).
When the check phase is reached the CheckImmediate
callback will be invoked. Lets set a breakpoint in that function and verify this:
(lldb) breakpoint set --file node.cc --line 286
static void CheckImmediate(uv_check_t* handle) {
Environment* env = Environment::from_immediate_check_handle(handle);
HandleScope scope(env->isolate());
Context::Scope context_scope(env->context());
MakeCallback(env, env->process_object(), env->immediate_callback_string());
}
Following MakeCallback
will will find ourselves in timers.js and its processImmediate
function which you might recall that we set:
process._immediateCallback = processImmediate;
immediate._callback = immediate._onImmediate;
immediate._onImmediate
will be our callback function (anonymous in setimmediate.js)
tryOnImmediate(immediate, tail);
will call:
runCallback(immediate);
will call:
return timer._callback();
And the callback is:
function () {
console.log('bajja');
}
And there we have how setImmediate works in Node.js.
process._nextTick
The very simple JavaScript looks like this:
process.nextTick(function () {
console.log('bajja');
});
nextTick
is defined in lib/internal/process/next_tick.js
.
After a few checks what happens is that the callback is added to the nextTickQueue:
nextTickQueue.push({
callback,
domain: process.domain || null,
args
});
nextTickQueue
is an array:
var nextTickQueue = [];
And we are pushing an object with the callback as a function named callback, domain and args. So for every nextTick called an entry will be added to the queue.
tickInfo[kLength]++;
Recall that TickInfo is an inner class of Environment. Lets back up a little. bootstrap/node.js
will call next_tick's setup() function from its start function:
NativeModule.require('internal/process/next_tick').setup();
exports.setup = setupNextTick;
var microtasksScheduled = false;
// Used to run V8's micro task queue.
var _runMicrotasks = {};
// *Must* match Environment::TickInfo::Fields in src/env.h.
var kIndex = 0;
var kLength = 1;
process.nextTick = nextTick;
// Needs to be accessible from beyond this scope.
process._tickCallback = _tickCallback;
process._tickDomainCallback = _tickDomainCallback;
// This tickInfo thing is used so that the C++ code in src/node.cc
// can have easy access to our nextTick state, and avoid unnecessary
// calls into JS land.
const tickInfo = process._setupNextTick(_tickCallback, _runMicrotasks);
process._setupNextTick
is initialized in SetupProcessObject
in src/node.cc:
env->SetMethod(process, "_setupNextTick", SetupNextTick);
Lets take a look at what SetupNextTick does...
env->set_tick_callback_function(args[0].As<Function>());
env->SetMethod(args[1].As<Object>(), "runMicrotasks", RunMicrotasks);
So, here we are setting a method named runMicrotasks
on the _runMicrotasks
object
passed to _setupNextTick
.
// Do a little housekeeping.
env->process_object()->Delete(
env->context(),
FIXED_ONE_BYTE_STRING(args.GetIsolate(), "_setupNextTick")).FromJust();
Looks like this removes the _setupNextTick function from the process object afterwards.
uint32_t* const fields = env->tick_info()->fields();
uint32_t const fields_count = env->tick_info()->fields_count();
What are 'fields'? What are 'fields_count'?
(lldb) p fields_count
(uint32_t) $23 = 2
Local<ArrayBuffer> array_buffer =
ArrayBuffer::New(env->isolate(), fields, sizeof(*fields) * fields_count);
args.GetReturnValue().Set(Uint32Array::New(array_buffer, 0, fields_count));
So tickInfo
returned will be an ArrayBuffer:
const tickInfo = process._setupNextTick(_tickCallback, _runMicrotasks);
Next we assign the RunMicroTasks
callback to the _runMicrotasks
variable:
_runMicrotasks = _runMicrotasks.runMicrotasks;
After this we are done in bootstrap/node.js and the setup of next_tick. So, lets continue and break in our script and follow process.setNextTick.
nextTickQueue.push({
callback,
domain: process.domain || null,
args
});
So we are again showing that we add callback info to the nextTickQueue (after a few checks) Then we do the following:
tickInfo[kLength]++;
For each object added to the nextTickQueue we will increment the second element of the tickInfo array.
And that is it, the stack frames will start returning and be poped off the call stack. What we
are interested in is in module.js and Module.runMain
:
process._tickCallback();
do {
while (tickInfo[kIndex] < tickInfo[kLength]) {
tock = nextTickQueue[tickInfo[kIndex]++];
...
_combinedTickCallback(args, callback);
if (kMaxCallbacksUntilQueueIsShortened < tickInfo[kIndex])
tickDone();
}
} while (tickInfo[kLength] !== 0);
The check is to see if tickInfo[kIndex] (is this the index of being processed?) is less than
the number of tick callbacks in the nextTickQueue
.
Next tickInfo[kIndex] is retrieved from the nextTickQueue and then tickInfo[kIndex] is incremented.
tickDone()
:
function tickDone() {
if (tickInfo[kLength] !== 0) {
if (tickInfo[kLength] <= tickInfo[kIndex]) {
nextTickQueue = [];
tickInfo[kLength] = 0;
} else {
nextTickQueue.splice(0, tickInfo[kIndex]);
tickInfo[kLength] = nextTickQueue.length;
}
}
tickInfo[kIndex] = 0;
}
Lets take a look at:
if (tickInfo[kLength] <= tickInfo[kIndex]) {
If the number of callbacks added is less than or equal to the just processed callbacks index this would mean that all of the callbacks in the queue have been processed and the following clause will make nextTickQueue point to an empty array and reset tickInfo[kLength] to zero. But if there are more callback in the queue than the just processed callbacks index the else clause will be taken:
nextTickQueue.splice(0, tickInfo[kIndex]);
tickInfo[kLength] = nextTickQueue.length;
splice will remove all elements from 0 to tickInfo[kIndex], which is removing all the processed callbacks. The new length is set as tickInfo[kLength]. This is done so that the nextTickQueue array does not become too large and run the process out of memory. By shortning the array this reduces the likelyhood of this happening.
Compiling with a different version of libuv
What I'd like to do is use my local fork of libuv instead of the one in the deps
directory. I think the way to do this is to make install
and then run configure with the following options:
$ ./configure --debug --shared-libuv --shared-libuv-includes=/usr/local/include
The location of the library is /usr/local/lib
, and /usr/local/include
for the headers on my machine.
Updating addons test
Some of the addons tests are not version controlled but instead generate using:
$ ./node tools/doc/addon-verify.js doc/api/addons.md
The source for these tests can be found in doc/api/addons.md
and these might need to be updated if a
change to all tests is required, for a concrete example we wanted to update the build/Release/addon directory
to be different depending on the build type (Debug/Release) and I forgot to update these tests.
Using nvm with Node.js source
Install to the nvm versions directory:
$ make install DESTDIR=~/.nvm/versions/node/ PREFIX=v8.0.0
You can then use nvm to list that version and versions:
$ nvm ls
v6.5.0
v7.0.0
v7.4.0
v8.0.0
$ nvm use 8
lldb
There is a .lldbinit which contains a number of useful alias to print out various V8 objects. This are most of the aliases defined in gdbinit.
For example, you can print a v8::Localv8::Function using the builtin print command:
(lldb) p init_fn
(v8::Local<v8::Function>) $3 = (val_ = 0x000000010484f900)
This does not give much, but if we instead use jlh:
(lldb) jlh init_fn
0x19417e265ba9: [Function]
- map = 0x382d7ba86ea9 [FastProperties]
- prototype = 0xd21f3203f39
- elements = 0x18ede4802241 <FixedArray[0]> [FAST_HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0x23c6dbac1ce1 <SharedFunctionInfo init>
- name = 0x21a813bbd419 <String[4]: init>
- formal_parameter_count = 4
- context = 0x19417e203b41 <FixedArray[8]>
- literals = 0x18ede4804a49 <FixedArray[1]>
- code = 0x1aa594184481 <Code: BUILTIN>
- properties = {
#length: 0x18ede4850bd9 <AccessorInfo> (accessor constant)
#name: 0x18ede4850c49 <AccessorInfo> (accessor constant)
#prototype: 0x18ede4850cb9 <AccessorInfo> (accessor constant)
}
So that gives us more information, but lets say you'd like to see the name of the function:
(lldb) jlh init_fn->GetName()
#init
Promise builtin
For debugging the builtin promise we are going to disable V8 snapshots:
$ ./configure --without-snapshot --debug
$ lldb -- out/Debug/node ../scripts/promise.js
(lldb) br s -f bootstrapper.cc -l 2321
It takes a while before the breakpoint is hit but it will be. And we are going to look at the following js:
const p = new Promise((resolve, reject) => {
resolve('ok');
});
Handle<JSFunction> promise_fun = InstallFunction(global,
"Promise", JS_PROMISE_TYPE, JSPromise::kSizeWithEmbedderFields,
0, factory->the_hole_value(), Builtins::kPromiseConstructor);
(lldb) expr isolate()->builtins()->builtin_handle(Builtins::Name::kPromiseConstructor)->Print()
0x3c5be1744d61: [Code]
- map: 0x3657d7704051 <Map(HOLEY_ELEMENTS)>
kind = BUILTIN
name = PromiseConstructor
compiler = turbofan
address = 0x3c5be1744d61
Body (size = 3644)
Instructions (size = 3320)
0x3c5be1744dc0 0 55 push rbp
0x3c5be1744dc1 1 4889e5 REX.W movq rbp,rsp
0x3c5be1744dc4 4 56 push rsi
0x3c5be1744dc5 5 57 push rdi
0x3c5be1744dc6 6 50 push rax
0x3c5be1744dc7 7 4883ec40 REX.W subq rsp,0x40
0x3c5be1744dcb b 4989e2 REX.W movq r10,rsp
0x3c5be1744dce e 4883ec08 REX.W subq rsp,0x8
0x3c5be1744dd2 12 4883e4f0 REX.W andq rsp,0xf0
0x3c5be1744dd6 16 4c891424 REX.W movq [rsp],r10
0x3c5be1744dda 1a 488bc2 REX.W movq rax,rdx
0x3c5be1744ddd 1d 488955e0 REX.W movq [rbp-0x20],rdx
0x3c5be1744de1 21 488bde REX.W movq rbx,rsi
0x3c5be1744de4 24 488bfe REX.W movq rdi,rsi
0x3c5be1744de7 27 48be000000001c000000 REX.W movq rsi,0x1c00000000
0x3c5be1744df1 31 48ba69bce15e57360000 REX.W movq rdx,0x36575ee1bc69 ;; object: 0x36575ee1bc69 <String[157]: CAST(Parameter(Linkage::GetJSCallContextParamIndex( static_cast<int>(call_descriptor->JSParameterCount())))) at ../deps/v8/src/compiler/code-assembler.cc:385>
0x3c5be1744dfb 3b 498d85185620fa REX.W leaq rax,[r13-0x5dfa9e8]
....
Handle<SharedFunctionInfo> shared(promise_fun->shared(), isolate);
shared->SetConstructStub(*BUILTIN_CODE(isolate, JSBuiltinsConstructStub));
shared->set_internal_formal_parameter_count(1);
shared->set_length(1);
InstallSpeciesGetter(promise_fun);
SimpleInstallFunction(promise_fun, "all", Builtins::kPromiseAll, 1, true);
SimpleInstallFunction(promise_fun, "race", Builtins::kPromiseRace, 1, true);
SimpleInstallFunction(promise_fun, "resolve", Builtins::kPromiseResolveTrampoline, 1, true);
SimpleInstallFunction(promise_fun, "reject", Builtins::kPromiseReject, 1, true);
So we can see that the Promise function is set up as a global. The then
function is later setup using:
Handle<JSFunction> promise_then = SimpleInstallFunction(prototype, isolate->factory()->then_string(), Builtins::kPromisePrototypeThen, 2, true);
native_context()->set_promise_then(*promise_then);
deps/v8/src/builtins/builtins-definitions.h
:
TFJ(PromiseConstructor, 1, kExecutor)
TFJ means TurboFan JavaScript linkage and means it is callable as a JavaScript function.
Debug JavaScript tests
Just example commands that I use in different projects to run the debugger with different test suites.
Mocha
$ mocha --inspect --debug-brk -u exports --recursive -t 10000 ./test/setup.js test/sync/test_index.js
crypto
The current version of openssl is 1.0.2k (run process.versions.openssl). So this is major version 1, minor 0 and patch 2k I guess. To investigate lets take a look what happens when one requires crypto:
const crypto = require('crypto');
$ lldb -- ./out/Debug/node crypto.js
src/node_crypto.cc is a builtin:
NODE_MODULE_CONTEXT_AWARE_BUILTIN(crypto, node::crypto::InitCrypto)
So, lets set a breakpoint in InitCrypto
:
(lldb) breakpoint set -f node_crypto.cc -l 6007
(lldb) breakpoint set -f node_crypto.cc -l 5880
First things that happens is that libcrypto must be initializes. My understanding is that OpenSSL has two libraries which are libssl which used libcrypto. Node is using libcrypto in this case.
From InitCryptOnce:
SSL_load_error_strings();
OPENSSL_no_config();
OPENSSL_no_config() marks OpenSSL as configured. It seems that if this is not done to avoid some of OpenSSLs standard init functions that automatically call the configuration to just return hence do nothing.
if (!openssl_config.empty())
This path would be taken if a openssl config had been passed using --openssl-config
.
Next, we have:
SSL_library_init();
This function can be found in deps/openssl/openssl/ssl/ssl_algs.c
EVP_add_cipher(EVP_des_cbc());
EVP I think stands for envelope and has a number of high level cryptographic functions.
CNNIC
China Internet Network Information Center (CNNIC) is referenced in some code. It is a Certificate Authority (may be other things as well).
Signed Public Key and Challenge (SPKAC)
Also known as Netscape SPKI (spooky). There was originally an element named keygen in the html5 spec which was later removed. The intention was to create client side certificates through a web service for protocols like WebID.
Building with shared openssl
Building an locally built version of OpenSSL.
$ ./configure --debug --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/build_1_1_0g/lib --shared-openssl-includes=/Users/danielbevenius/work/security/build_1_1_0g/include
$ make -j8
Building OpenSSL without elliptic curve support
$ ./Configure no-ec --debug --prefix=/Users/danielbevenius/work/security/openssl/build --libdir="openssl" darwin64-x86_64-cc
Then building Node against that version:
$ ./configure --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/openssl/build/openssl --shared-openssl-includes=/Users/danielbevenius/work/security/openssl/build/include
This will not compile are the headers for ec will not exist in the --shared-openssl-includes
directory. You'll have to use the source
include directory instead so that all the headers can be found.
$ ./configure --debug --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/openssl/build/openssl --prefix=/Users/danielbevenius/work/nodejs/build --shared-openssl-includes=/Users/danielbevenius/work/security/openssl/include
There will instead be a runtime error if you try to call functions that require EC but you'll be able to build.
Notice here that we have configured OpenSSL without Elliptic curve support
I was wondering what the values will if you just specify --shared-openssl
which I've seen. In this case pkg-config
will be called to retrieve information about
install libraries in the system. For my system this will be:
'libraries': ['-lcrypto', '-lssl']},
RHEL8
OpenSSL on RHEL8 will be OpenSSL 1.1.1 and TLS 1.3. For node this will have implications if node is built and dynamically linking OpenSSL.
Is the OpenSSL version that RHEL ships the same as the upstream one?
If there is no difference than perhaps using different certificates or something,
I think we should be using the statically linked version of OpenSSL. I know that
previously dynamically linking was argued as better as that would mean that only
the OpenSSL dynamic library would have to be updated in the case of a CVE. But
in node's case it is very tightly coupled with OpenSSL and changes are that there
are code changes in node core required. So just updating the dynamic OpenSSL
library might break a node application. So in reality if a CVE coming out for OpenSSL the node
package/executable will also have to be updated. Also, at least for us with
in rhoar our target is a container we would have to update the base image in
addition to an update to node..
Building on Solaris
I used VirtualBox to build and run the test suite on Solaris
$ uname -a
SunOS solaris 5.11 11.3 i86pc i386 i86pc
Setup
$ sudo pkgadd -d http://get.opencsw.org/now
Install git:
$ sudo /opt/csw/bin/pkgutil -y -i git
$ export PATH=/opt/csw/bin:$PATH // added to ~/.bashrc
Install binutils:
$ sudo pkgutil -y -i binutils
$ export PATH=/opt/csw/bin:/opt/csw/gnu:$PATH
Set GNU Make as the default:
$ sudo ln -s /usr/bin/gmake /usr/bin/make
Clone node:
$ git clone https://github.com/nodejs/node.git
Install gcc 49:
$ sudo pkg install --accept --license gcc-49
$ gmake CXXFLAGS+="--function-sections -fdata-sections"
Install stdc++6:
$ sudo pkgchk -L CSWlibstdc++6
$ export LD_LIBRARY_PATH=/opt/csw/lib/:$LD_LIBRARY_PATH
Patches:
danbev@solaris:~/work/node$ git show 8ff7afd
commit 8ff7afd2aba1cc13348e5d639f292b2fbb3b86d0
Author: Daniel Bevenius <daniel.bevenius@gmail.com>
Date: Thu Mar 16 06:59:05 2017 +0100
add include ldflag for solaris
It looks like when cctest is to be compiled the includes are
missing. Trying to add the SHARED_INTERMEDIATE_DIR and see if
that fixes one of the includes. If that works I'll add the rest
if required.
diff --git a/node.gyp b/node.gyp
index 2407844..4bd3769 100644
--- a/node.gyp
+++ b/node.gyp
@@ -667,6 +667,9 @@
]},
],
}],
+ ['OS=="solaris"', {
+ 'ldflags': [ '-I<(SHARED_INTERMEDIATE_DIR)' ]
+ }],
]
}
], # end targets
danbev@solaris:~/work/node$ git show 68c396b
commit 68c396be00896801bb92b5705da240d9aef30890
Author: Daniel Bevenius <daniel.bevenius@gmail.com>
Date: Thu Mar 16 12:21:24 2017 +0100
fix for building on solaris
diff --git a/common.gypi b/common.gypi
index 3aad8e7..e001a1d 100644
--- a/common.gypi
+++ b/common.gypi
@@ -282,7 +282,8 @@
}],
[ 'OS in "linux freebsd openbsd solaris android aix"', {
'cflags': [ '-Wall', '-Wextra', '-Wno-unused-parameter', ],
- 'cflags_cc': [ '-fno-rtti', '-fno-exceptions', '-std=gnu++0x' ],
+ #'cflags_cc': [ '-fno-rtti', '-fno-exceptions', '-std=gnu++0x' ],
+ 'cflags_cc': [ '-fno-rtti', '-fno-exceptions' ],
'ldflags': [ '-rdynamic' ],
'target_conditions': [
# The 1990s toolchain on SmartOS can't handle thin archives.
@@ -323,6 +324,8 @@
'ldflags': [ '-m64', '-march=z196' ],
}],
[ 'OS=="solaris"', {
+ 'defines': ['_GLIBCXX_USE_C99_MATH'],
+ 'cflags_cc': [ '-std=c++11' ],
'cflags': [ '-pthreads' ],
'ldflags': [ '-pthreads' ],
'cflags!': [ '-pthread' ],
Build and run node tests:
$ gmake cctest
Building on Windows
I used VirtualBox to build and run the test suite on Windows
$ .\vcbuild test
Debugging
$ lldb -- out/Debug/cctest
(lldb) r
[ RUN ] EnvironmentTest.MultipleEnvironmentsPerIsolate
cctest(86419,0x7fff799ca000) malloc: *** error for object 0x104401058: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Process 86419 stopped
* thread #1: tid = 0x2bd84d0, 0x00007fff97229f06 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
frame #0: 0x00007fff97229f06 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
-> 0x7fff97229f06 <+10>: jae 0x7fff97229f10 ; <+20>
0x7fff97229f08 <+12>: movq %rax, %rdi
0x7fff97229f0b <+15>: jmp 0x7fff972247cd ; cerror_nocancel
0x7fff97229f10 <+20>: retq
(lldb) memory read 0x104401058
(lldb) jlh handle
0x1872340551b1: [Symbol]
- hash: 60730982
- name: 0x187234028391 <String[14]: node:npnBuffer>
- private: 1
(lldb) jlh result
0x1e9a8f728239: [Symbol]
- hash: 171677908
- name: 0x1e9a8f728211 <String[15]: node:alpnBuffer>
- private: 1
Switching between clang and gcc
CXX=g++ CXX.host=g++ && ./configure -- -Dclang=0.
Ninja
Macosx:
$ ./configure --ninja
$ ninja -C out/Release
$ ./configure && tools/gyp_node.py -f ninja && ninja -C out/Release && ln -fs out/Release/node node
This will generate object files in out/Release/src/
but the names will be obj/src/node.node.o
instead of obj/src/node/node.o
. This matters
as we generete object files for different operating systems.
If you take a look in out/Release/ninja.build you'll find a bunch of subninja commands which are used to include other ninja build files:
....
subninja obj/node.ninja
Building with Ninja linux
$ dnf install ninja-build
$ ./configure --debug --ninja
$ ninja-build -C out/Release
$ out/Release/cctest
Building with Ninja on windows
You'll need to install Visual Studion 2015 and make sure you select Common Tools for C++.
Open cmd with administrator priveliges (Start -> Search for cmd ->CTRL+SHIFT+ENTER):
> python configure --debug --ninja --dest-cpu=x86 --without-intl
> tools\gyp_node.py -f ninja
> ninja -C out\Release
> out\Release\cctest.exe
Linux getauxval
A good description of this can be found here: https://lwn.net/Articles/519085/
The getauxval is a function that was added in glibc 2.16
#if defined(__linux__) && defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 16)
# define HAS_GETAUXVAL 1
# include <sys/auxv.h>
# endif // HAS_GETAUXVAL
#endif
# LD_SHOW_AUXV=1 ./node
AT_SYSINFO_EHDR: 0x7fff503ee000
AT_HWCAP: 9f8bfbff
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0x400040
AT_PHENT: 56
AT_PHNUM: 9
AT_BASE: 0x7f223f1d6000
AT_FLAGS: 0x0
AT_ENTRY: 0x859c90
AT_UID: 0
AT_EUID: 0
AT_GID: 0
AT_EGID: 0
AT_SECURE: 0
AT_RANDOM: 0x7fff503dddc9
AT_EXECFN: ./node
AT_PLATFORM: x86_64
To force the setting of AT_SECURE:
$ setcap cap_net_raw+ep out/Release/node
$ getcap out/Release/node
$ useradd beve
$ su beve
$ ./out/Release/node
Show all v8 exceptions
$ out/Release/node --print_all_exceptions /work/node/test/parallel/test-process-setuid-setgid.js
Creating patches
I've found that I need to create patches from old patches that worked with a previous version. At work we apply patches to node sources and when new versions are released these patches might not apply cleanly making it a manual task to make updates to that tag and then generating the patches to be applied.
$ patch -p1 < 000x-something.patch
$ git diff --patch-with-stat > patch.out
Then I just copy this and replace the original patch section, but keeping the rest of the original patch. Then try to apply the patch again to make sure it applied cleanly
Cherry pick a V8 commit
$ git format-patch -1 --stdout f5fad6d > out.patch
$ git am --directory deps/v8 ~/work/google/javascript/v8/verbose.patch
Don't forget to bump the patch version in deps/v8/include/v8-version.h Also common common.gypi needs to be updated as well:
'v8_embedder_string': '-node.5',
HTTP/2
const server = h2.createServer();
createServer
is defined in lib/internal/http2/core.js
and it returns a
new Http2Server.
Http2Server extends Server in net.js. After calling the super classes constructor
the following call is performed:
this.on('newListener', setupCompat);
setupCompat is a event listener callback which only handles 'request' events, in which case it
Messages/Exceptions with V8
Each V8 Isolate allows listeners to be added for various error messages/exceptions.
isolate->AddMessageListener(OnMessage);
isolate->SetFatalErrorHandler(OnFatalError);
isolate->SetAbortOnUncaughtExceptionCallback(ShouldAbortOnUncaughtException);
AddMessageListener
is a callback that will be called when an error occurs. How does this work then?
When a script is run, for example:
const char *js = "age = ajj40"; // intentionally trigger an exception
Local<String> source = String::NewFromUtf8(isolate, js, NewStringType::kNormal).ToLocalChecked();
Local<Script> script = Script::Compile(context, source).ToLocalChecked();
Local<Value> result = script->Run(context).ToLocalChecked();
Script::Run can be found in api.cc
(lldb) br s -f api.cc -l 2017
has_pending_exception = !ToLocal<Value>(i::Execution::Call(isolate, fun, receiver, 0, nullptr), &result);
See the returned value is a bool indicating if there was an error.
i::Execution::Call
will call:
return CallInternal(isolate, callable, receiver, argc, argv, MessageHandling::kReport)
Notice that in the MessageHandling::kReport being passed in. CallInternal looks like this (in our case):
return Invoke(isolate, false, callable, receiver, argc, argv, isolate->factory()->undefined_value(), message_handling);
MUST_USE_RESULT MaybeHandle<Object> Invoke(
Isolate* isolate, bool is_construct, Handle<Object> target,
Handle<Object> receiver, int argc, Handle<Object> args[],
Handle<Object> new_target, Execution::MessageHandling message_handling)
In our case the target is a JavaScript Function which can be checked by:
(lldb) p target->IsJSFunction()
(bool) $58 = true
This will cause us to enter the following block in the Invoke function:
if (target->IsJSFunction()) {
Handle<JSFunction> function = Handle<JSFunction>::cast(target);
But there is a check:
if ((!is_construct || function->IsConstructor()) &&
function->shared()->IsApiFunction()) {
which causes us to exit the block.
// Placeholder for return value.
Object* value = NULL;
typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
Object* receiver, int argc,
Object*** args);
So we are declaring a JSEntryFunction which is of type pointer to function that returns a pointer to Object and takes the arguments listed.
Handle<Code> code = is_construct
? isolate->factory()->js_construct_entry_code()
: isolate->factory()->js_entry_code();
How are js_entry_code() set?
See Heap objects
for details.
JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
if (FLAG_clear_exceptions_on_js_entry) isolate->clear_pending_exception();
// Call the function through the right JS entry stub.
Object* orig_func = *new_target;
Object* func = *target;
Object* recv = *receiver;
Object*** argv = reinterpret_cast<Object***>(args);
if (FLAG_profile_deserialization && target->IsJSFunction()) {
PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
}
RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
(lldb) br s -f execution.cc -l 145
CALL_GENERETED_CODE is a macro which is used to call code generated by one of the compilers. I'm on a x64 some guessing that src/x64/simulator-x64.h is getting called which does:
// TODO(X64): Don't pass p0, since it isn't used?
#define CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
(entry(p0, p1, p2, p3, p4))
Setting a breakpoint after this call we can inspect the value returned:
(lldb) job value
#exception
// Update the pending exception flag and return the value.
bool has_exception = value->IsException(isolate);
DCHECK(has_exception == isolate->has_pending_exception());
if (has_exception) {
if (message_handling == Execution::MessageHandling::kReport) {
isolate->ReportPendingMessages();
}
return MaybeHandle<Object>();
} else {
isolate->clear_pending_message();
}
Remember that we passed Execution::MessageHandling::kReport earlier so we will call isolate-ReportPendingMessages()
void Isolate::ReportPendingMessages() {
DCHECK(AllowExceptions::IsAllowed(this));
Object* exception = pending_exception();
pending_exception() performs some checks and then returns:
return thread_local_top_.pending_exception_;
(lldb) job exception
0x94c7c108af1: [JS_ERROR_TYPE]
- map = 0xcc576b8e179 [FastProperties]
- prototype = 0x1993f4b15221
- elements = 0x37d706602241 <FixedArray[0]> [FAST_HOLEY_SMI_ELEMENTS]
- properties = 0x94c7c108ec9 <FixedArray[3]> {
#stack: 0x3aeec9e69bf1 <AccessorInfo> (const accessor descriptor)
#message: 0x94c7c108ac9 <String[20]: ajj40 is not defined> (data field 0)
0x37d706604d71 <Symbol: stack_trace_symbol>: 0x94c7c109099 <JSArray[6]> (data field 1)
// Try to propagate the exception to an external v8::TryCatch handler. If
// propagation was unsuccessful, then we will get another chance at reporting
// the pending message if the exception is re-thrown.
bool has_been_propagated = PropagatePendingExceptionToExternalTryCatch();
if (!has_been_propagated) return;
Looking into PropagatePendingExceptionToExternalTryCatch:
bool Isolate::PropagatePendingExceptionToExternalTryCatch() {
Object* exception = pending_exception();
The first thing it does is again call pending_exceptions() invoking the same checks are before which should really be the same. I wonder if there might be use for an overloaded function taking a pointer to exception?
HandleScope scope(this);
Handle<JSMessageObject> message(JSMessageObject::cast(message_obj), this);
Handle<JSValue> script_wrapper(JSValue::cast(message->script()), this);
Handle<Script> script(Script::cast(script_wrapper->value()), this);
int start_pos = message->start_position();
int end_pos = message->end_position();
MessageLocation location(script, start_pos, end_pos);
MessageHandler::ReportMessage(this, &location, message);
MessageHandler::ReportMessage
prepares the error message (very simplified) and ends up calling:
ReportMessageNoExceptions(isolate, loc, message, api_exception_obj);
v8::MessageCallback callback =FUNCTION_CAST<v8::MessageCallback>(callback_obj->foreign_address());
Handle<Object> callback_data(listener->get(1), isolate);
{
// Do not allow exceptions to propagate.
v8::TryCatch try_catch(reinterpret_cast<v8::Isolate*>(isolate));
callback(api_message_obj, callback_data->IsUndefined(isolate) ? api_exception_obj : v8::Utils::ToLocal(callback_data));
}
This is the call that invokes are message callback. After this we will back out of the frame stacks. On thing I notice that there was an ExceptionScope which the deconstructor sets the exception object. Need to look into why this is done. So we will eventually end up back where in v8::Script::Run:
has_pending_exception = !ToLocal<Value>(i::Execution::Call(isolate, fun, receiver, 0, nullptr), &result);
(lldb) p has_pending_exception
(bool) $208 = true
RETURN_ON_FAILED_EXECUTION(Value);
RETURN_ESCAPED(result);
Lets take a closer look at RETURN_ON_FAILED_EXECUTION(Value):
#define RETURN_ON_FAILED_EXECUTION(T) \
EXCEPTION_BAILOUT_CHECK_SCOPED(isolate, MaybeLocal<T>())
So that will be EXCEPTION_BAILOUT_CHECK_SCOPED(isolate, MaybeLocal<Value>()) which becomes:
#define EXCEPTION_BAILOUT_CHECK_SCOPED(isolate, value) \
do { \
if (has_pending_exception) { \
call_depth_scope.Escape(); \
return value; \
} \
} while (false)
So in our case the value returned will be a MaybeLocal<Value> object. This is what gets returned here:
Local<Value> result = script->Run(context).ToLocalChecked();
Now, notice that we are calling ToLocalChecked(), which
if (V8_UNLIKELY(val_ == nullptr)) V8::ToLocalEmpty();
Utils::ApiCheck(false, "v8::ToLocalChecked", "Empty MaybeLocal.");
if (!condition) Utils::ReportApiFailure(location, message);
FatalErrorCallback callback = isolate->exception_behavior();
callback(location, message);
And this is where our OnFatalError function is called. But without the ToLocalChecked our OnFatalError would not be called.
Well, during invokation of a Script, recall from above it was mentioned that we'll end up
in CALL_GENERATED_CODE
. Now, if we set a breakpoint in file = 'isolate.cc', line = 1062
which is the Throw function we can backtrace and see how we ended up there.
ic.cc:2395:
v8::internal::Runtime_LoadGlobalIC_Miss
Handle<Object> result;
ASSIGN_RETURN_FAILURE_ON_EXCEPTION(isolate, result, ic.Load(name));
return *result;
MaybeHandle<Object> LoadIC::Load(Handle<Object> object, Handle<Name> name)
This will throw an ReferenceError in our case:
return ReferenceError(name);
THROW_NEW_ERROR(isolate(), NewReferenceError(MessageTemplate::kNotDefined, name), Object);
That call will bring us back to isolate.cc (1064) which is the Throw function we breaked in. This time we have a try_catch_handler:
(lldb) p *try_catch_handler()
(v8::TryCatch) $2 = {
isolate_ = 0x0000000105000000
next_ = 0x0000000000000000
exception_ = 0x00003107fe782351
message_obj_ = 0x00003107fe782351
js_stack_comparable_address_ = 0x00007fff5fbfea50
is_verbose_ = true
can_continue_ = true
capture_message_ = true
rethrow_ = false
has_terminated_ = false
}
ReportPendingMessages:
// Determine whether the message needs to be reported to all message handlers
// depending on whether and external v8::TryCatch or an internal JavaScript
// handler is on top.
bool should_report_exception;
if (IsExternalHandlerOnTop(exception)) {
// Only report the exception if the external handler is verbose.
should_report_exception = try_catch_handler()->is_verbose_;
} else {
// Report the exception if it isn't caught by JavaScript code.
should_report_exception = !IsJavaScriptHandlerOnTop(exception);
}
Is this case we have a TryCatch RAII and have set verbose to true. There is no javascript try/catch so the MessageHandler we set by calling AddMessageHandler on the isolate should be the target handler.
SetFatalErrorHandler
If this callback handler is not set then the default behaviour in V8 will be to
print a stactrace and exit, for example:
$ ./exceptions
OnMessage...
#
# Fatal error in v8::ToLocalChecked
# Empty MaybeLocal.
#
Received signal 4 <unknown> 000109100351
==== C stack trace ===============================
[0x00010910322e]
[0x000109103265]
[0x000109103186]
[0x7fff8a2c252a]
[0x000107ac8423]
[0x00010624ffbc]
[0x000106254bef]
[0x000106254c1d]
[0x00010621d4dc]
[0x7fff86cc45ad]
[0x000000000001]
[end of stack trace]
Illegal instruction: 4
You can set a handler for this that does something before exiting of choose to not exit.
When an exceptions is raised in V8 it will cause the function Throw in e
Object* Isolate::Throw(Object* exception, MessageLocation* location) {
TryCatch
Is used to create a try/catch block in V8 which used RAII:
{
TryCatch try_catch(isolate);
try_catch.SetVerbose(true);
... perform operations
if (try_catch.HasCaught()) {
printf("Caught: %s\n", *String::Utf8Value(try_catch.Exception()));
}
How does setting verbose
affect things?
In Isolate::Throw:
Which can be called if there is a JavaScript error and if an external TryCatch exist and
verbose is true. If that is the case the Message will be created:
Handle<Object> message_obj = CreateMessage(exception_handle, location);
thread_local_top()->pending_message_obj_ = *message_obj;
...
set_pending_exception(*exception_handle);
return heap()->exception();
So that is what throw does, sets the message_obj and the exception_handle and then returns.
What if we set verbose to false:
(lldb) expr try_catch.SetVerbose(false);
In this case the message will still be created and the exception set on the thread_local, but when we get to ReportPendingMessages is_verbose_ will be false and the callback will be skipped.
In src/api.cc
(line 2710) we can find the TryCatch constructor:
v8::TryCatch::TryCatch(v8::Isolate* isolate)
: isolate_(reinterpret_cast<i::Isolate*>(isolate)),
next_(isolate_->try_catch_handler()),
is_verbose_(false),
can_continue_(true),
capture_message_(true),
rethrow_(false),
has_terminated_(false) {
ResetInternal();
...
isolate_->RegisterTryCatchHandler(this);
}
`ResetInternal` will do the following:
```c++
i::Object* the_hole = isolate_->heap()->the_hole_value();
exception_ = the_hole;
message_obj_ = the_hole;
(lldb) p the_hole
(v8::internal::Object *) $4 = 0x0000278095682321
`RegisterTryCatchHandler` will do:
```c++
thread_local_top()->set_try_catch_handler(that);
After this call we can inspect thread_local_top_:
(lldb) p *thread_local_top()
(v8::internal::ThreadLocalTop) $22 = {
isolate_ = 0x0000000104806e00
context_ = 0x000025b6fbf03af1
thread_id_ = (id_ = 1)
pending_exception_ = 0x000025b693f02321
wasm_caught_exception_ = 0x0000000000000000
pending_handler_context_ = 0x0000000000000000
pending_handler_code_ = 0x0000000000000000
pending_handler_offset_ = 0
pending_handler_fp_ = 0x0000000000000000 <no value available>
pending_handler_sp_ = 0x0000000000000000 <no value available>
rethrowing_message_ = false
pending_message_obj_ = 0x000025b693f02321
scheduled_exception_ = 0x000025b693f02321
external_caught_exception_ = false
save_context_ = 0x0000000000000000
c_entry_fp_ = 0x0000000000000000 <no value available>
handler_ = 0x0000000000000000 <no value available>
c_function_ = 0x0000000000000000 <no value available>
promise_on_stack_ = 0x0000000000000000
js_entry_sp_ = 0x0000000000000000 <no value available>
external_callback_scope_ = 0x0000000000000000
current_vm_state_ = EXTERNAL
failed_access_check_callback_ = 0x0000000000000000
try_catch_handler_ = 0x00007fff5fbfde60
}
Notice that pending_exception
, pending_message_obj_
, and scheduled_exception_
are all set to the_hole
. The try_catch_handler_
is also
set.
So these are memory locations, and from what I understand so far is that these are made available to generated code, enabling them
to set values, like pending_message_obj_, but also have access to the try_catch_handler_.
This means that a generated piece of code would directly call/jump to try_catch_handler_ and have it execute. But what sets this up?
When CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
is called the stub_entry will point to the code generated
for JSEntryStub
. This will place the correct values in the expected registers.
For example, take the following from the generated assemble (created using (lldb) job *code):
movq r10,0x104808790 ;; external reference (Isolate::context_address)
movq r10,0x104808790 ;; external reference (Isolate::context_address)
This instruction is generated by src/x64/code-stubs-x64.cc
JSEntryStub::Generate
:
ExternalReference context_address(IsolateAddressId::kContextAddress, isolate());
__ Load(kScratchRegister, context_address);o
__ Push(kScratchRegister); // context
The above will set ExternalReference
address_:
ExternalReference::ExternalReference(IsolateAddressId id, Isolate* isolate)
: address_(isolate->get_address_from_id(id)) {}
And get_address_from_id
look like this:
Address Isolate::get_address_from_id(IsolateAddressId id) {
return isolate_addresses_[id];
}
Move(kScratchRegister, source);
(lldb) expr kScratchRegister
(const v8::internal::Register) $7 = {
v8::internal::RegisterBase<v8::internal::Register, 16> = (reg_code_ = 10)
}
We can find the following function in src/x64/macro-assembler-x64.h
:
void Move(Register dst, ExternalReference ext) {
movp(dst, reinterpret_cast<void*>(ext.address()), RelocInfo::EXTERNAL_REFERENCE);
}
Notice the RelocInfo::EXTERNAL_REFERENCE
which indicates that this is the address of an
external C++ function
(lldb) p ext.address()
(v8::internal::Address) $8 = 0x0000000105813b50 <no value available>
To understand how this works we can set a break point and run mkshnapshot:
$ $ lldb out.gn/learning/mksnapshot
(lldb) br s -f code-stubs-x64.cc -n JSEntryStub::Generate
(lldb) expr StackFrame::TypeToMarker(type())
(int32_t) $1 = 2 // pushq $0x2
0x3db66db84066 REX.W movq r10,0x104808790 ;; external reference (Isolate::context_address)
0x3db66db84070: movq (%r10), %r10
Notice that 0x104808790
is a pointer. This is then dereferenced and that value placed into r10:
(lldb) register read r10
r10 = 0x0000000104808790
(lldb) memory read -f x -c 1 -s 8 `($r10)`
0x104808790: 0x000025b6fbf03af1
So it contains the value 0x000025b6fbf03af1
and if we look at the context_
entry from above we can see
these match:
context_ = 0x000025b6fbf03af1
One thing I noticed in x64/code-stubs-x64.cc JSEntry function was:
__ bind(&handler_entry);
handler_offset_ = handler_entry.pos();
Now, handler_offset_
is a private member of JSEntryStub
in src/code-stubs.h
. This field is set
by FinishCode
:
Handle<Code> new_object = GenerateCode();
new_object->set_stub_key(GetKey());
FinishCode(new_object);
Now, handler_entry
is a label and this gets the position of the label. I'm guessing (at the moment) that the handler_offet_
allows other
generated functions to addess the exception handler.
deps/v8/src/x64/macro-assembler-x64.cc EnterFrame
. Just note that when Builtins::Generate_JSEntryTrampoline generates
code it calls EnterFrame
which does more than store/set rbp. Remember this or the code will be hard to follow.
Compiling on windows
I looking into an issue on windows: https://github.com/nodejs/node/issues/12952
The issue is a link error with the cctest target. In this particular case on windows it was built using:
.\vcbuild.bat dll debug x64 vc2015
This will result in the following options:
```console
configure --debug --shared --dest-cpu=x64 --tag=
So this is saying that node should be created as a shared library but not any of its dependencies. In this case the focus is on OpenSSL which will be a static library. The link error is:
openssl.lib(err.obj) : error LINK2005: ERR_put_error already defined in node.lib(node.dll
We can find the exported symbols of Debug/node.dll using:
dumpbin /EXPORTS Debug/node.dll > node-exports
We can find the ERR_put_error is indeed exported from node.dll This is done because in node.gyp we have:
[ 'OS=="win" and '
'node_use_openssl=="true" and '
'node_shared_openssl=="false" and node_shared=="false"', {
'use_openssl_def': 1,
}, {
'use_openssl_def': 0,
}],
In our case use_openssl_def will be 1 which is the used in node.gypi:
# openssl.def is based on zlib.def, zlib symbols
# are always exported.
['use_openssl_def==1', {
'sources': ['<(SHARED_INTERMEDIATE_DIR)/openssl.def'],
}],
['OS=="win" and use_openssl_def==0', {
'sources': ['deps/zlib/win32/zlib.def'],
A .def file is a module definition file which describes attributes of a DLL. In our case openssl.def can be found in Debug\obj\global_intermediate:
EXPORTS
...
ERR_put_error
So we can see this function is indeed exported, so if we link against openssl.lib this symbol (and the others) will be duplicates.
V8 Platform
This is used by an embedder and is an interface that must be implemented. This interface is defined in deps/v8/include/v8-platform.h. This interface has a number of functions related to threading and running task on threads:
- CallOnBackgroundThread
- CallOnForegroundThread
- CallDelayedOnForegroundThread ...
- AddTraceEvent
Node can be configured --without-v8-platform which will set environment variables that will be picked up by the pre-processor. For example, if you take a look in src/node.cc you will find:
#if NODE_USE_V8_PLATFORM
#include "libplatform/libplatform.h"
#endif // NODE_USE_V8_PLATFORM
libplatform/libplatform.h is the interface for a default v8::Platform implementation in V8. Next, we have a v8_platform struct that will use the above default v8::Platform implementation is NODE_USE_V8_PLATFORM is set. But what does it mean to not use the default v8::Platform implementation? Basically this will just create noop functions or functions that throw an error:
#else // !NODE_USE_V8_PLATFORM
void Initialize(int thread_pool_size) {}
void PumpMessageLoop(Isolate* isolate) {}
void Dispose() {}
bool StartInspector(Environment *env, const char* script_path,
const node::DebugOptions& options) {
env->ThrowError("Node compiled with NODE_USE_V8_PLATFORM=0");
return true;
}
void StartTracingAgent() {
fprintf(stderr, "Node compiled with NODE_USE_V8_PLATFORM=0, "
"so event tracing is not available.\n");
}
void StopTracingAgent() {}
When trying to run a test --without-v8-platform I run into the following error:
#
# Fatal error in ../deps/v8/src/v8.cc, line 110
# Check failed: platform_.
#
==== C stack trace ===============================
0 node 0x00000001019d2dae v8::base::debug::StackTrace::StackTrace() + 30
1 node 0x00000001019d2de5 v8::base::debug::StackTrace::StackTrace() + 21
2 node 0x00000001019ccf84 V8_Fatal + 452
3 node 0x00000001012bcea8 v8::internal::V8::GetCurrentPlatform() + 72
4 node 0x0000000100ed0c7c v8::internal::Isolate::Init(v8::internal::Deserializer*) + 1692
5 node 0x00000001012965ba v8::internal::Snapshot::Initialize(v8::internal::Isolate*) + 170
6 node 0x0000000100260265 v8::Isolate::New(v8::Isolate::CreateParams const&) + 485
7 node 0x000000010170901f node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 79
8 node 0x0000000101708bce node::Start(int, char**) + 478
9 node 0x000000010175f21e main + 94
10 node 0x0000000100000a34 start + 52
11 ??? 0x0000000000000002 0x0 + 2
Process 9882 stopped
2644 compiler_dispatcher_ =
2645 new CompilerDispatcher(this, V8::GetCurrentPlatform(), FLAG_stack_size);
v8::GetCurrentPlatform will perform the following:
v8::Platform* V8::GetCurrentPlatform() {
DCHECK(platform_);
return platform_;
}
But since we have not set a platform, remember that would have been done if NODE_USE_V8_PLATFORM:
#if NODE_USE_V8_PLATFORM
void Initialize(int thread_pool_size) {
platform_ = v8::platform::CreateDefaultPlatform(thread_pool_size);
V8::InitializePlatform(platform_);
tracing::TraceEventHelper::SetCurrentPlatform(platform_);
}
My understanding is that the in this code path the snapshot blob is being deserialized and when this is done.
Upgrading OpenSSL
Download and verify the download:
$ gpg openssl-1.0.2l.tar.gz.asc
gpg: Signature made Thu May 25 14:55:41 2017 CEST using RSA key ID 0E604491
gpg: Can't check signature: public key not found
$ gpg --keyserver pgpkeys.mit.edu --recv-key 0E604491
gpg: requesting key 0E604491 from hkp server pgpkeys.mit.edu
gpg: key 0E604491: public key "Matt Caswell <matt@openssl.org>" imported
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
$ gpg openssl-1.0.2l.tar.gz.asc
gpg: Signature made Thu May 25 14:55:41 2017 CEST using RSA key ID 0E604491
gpg: Good signature from "Matt Caswell <matt@openssl.org>"
gpg: aka "Matt Caswell <frodo@baggins.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 8657 ABB2 60F0 56B1 E519 0839 D9C4 D26D 0E60 4491
Compare the changes to make see if you have to make changes to the
OpenSSL with FIPS
When building the OpenSSL and specifying --openssl-fips
, which is described as
"Build OpenSSL using FIPS canister .o file in supplied folder"
Fipsld.
The canister file is just the collection of the all the source code into an single monolihic object module to guarentee the relative order of the original code components:
$ ld -r -o fipscanister.o fips_start.o ... fips_end.o
Doing this will preserve the relative order of the original code components.
The follwing can be seen on that page: "When performing a static link against the OpenSSL library, you have to embed the expected FIPS signature in your executable after final linking. Embedding the FIPS signature in your executable is most often accomplished with fisld." "fisld will take the place of the linker (or compiler if invoking via a compiler driver). If you use fisld to compile a source file, fisld will do nothing and simply invoke the compiler you specify through FIPSLD_CC. When it comes time to link, fisld will compile fips_premain.c, add fipscanister.o, and then perform the final link of your program. Once your program is linked, fisld will then invoke incore to embed the FIPS signature in your program."
But how about when we dynamically link to an OpenSSL version that has FIPS support
enabled. Since we would not be statically linking we should not have to specify
the fipscanister.o. But without specifying the --openssl-fips
flag the options
to enable/disable FIPS are not avaiable in node (nore are functions enabled/disabled
that would otherwise be when when FIPS mode is enabled).
Perhaps we can set the macro NODE_FIPS_MODE
if OPENSSL_FIPS
is set which should
be done when linking against an OpenSSL version that support FIPS?
For instructions building an OpenSSL version with FIPS support see: https://github.com/danbev/learning-libcrypto#fips
Detecting FIPS support: Currently, FIPS support is not available in the version that is shipped with Node.js. But we still have the option to dynamically link to a FIPS compatible OpenSSL library. This is for example what we do at Red Hat. There is a problem here though...
If we run the following command on default build (statically linking with OpenSSL in deps, so no FIPS support):
$ ./node -p "require('crypto').getFips()"
0
This expected.
Node.js can also be dynamically linked with OpenSSL, for example:
$ ./configure --shared-openssl --openssl-is-fips
The --openssl-is-fips
option specifies that the OpenSSL library is FIPS compatible
and this enables FIPS related functionality in Node to be available.
Building and running the same command on this version gives:
$./node -p "require('crypto').getFips()"
0
Now this is not expected.
None-FIPS configuration:
$ node -p 'process.config.variables' | grep openssl
node_shared_openssl: false,
node_use_openssl: true,
openssl_fips: '',
openssl_is_fips: false,
FIPS Compatible configuration:
./node -p 'process.config.variables' | grep openssl
node_shared_openssl: true,
node_use_openssl: true,
openssl_fips: '',
openssl_is_fips: true,
openssl_system_ca_path: '/etc/pki/tls/certs/ca-bundle.crt',
./node --expose-internals -p "require('internal/test/binding').internalBinding('config')"
{
isDebugBuild: false,
hasOpenSSL: true,
fipsMode: true,
hasIntl: true,
hasSmallICU: true,
hasTracing: true,
hasNodeOptions: true,
hasInspector: true,
noBrowserGlobals: false,
bits: 64,
hasDtrace: true,
hasCachedBuiltins: true
}
We can see that fipsMode has be set which is also correct. But why does getFips()
return 0? Because you have to also enable fips:
```console
$ ./node --enable-fips -p "require('crypto').getFips()"
1
Enable fips in a container (UBI8/RHEL8):
$ update-crypto-policies --set FIPS
$ fips-mode-setup --enable --no-bootcfg
$ export OPENSSL_FORCE_FIPS_MODE=true
OpenSSL FIPS Object Module
OpenSSL itself is not validated, and never will be. Instead a carefully defined software component called the OpenSSL FIPS Object Module has been created. The Module was designed for compatibility with the OpenSSL library so products using the OpenSSL library and API can be converted to use FIPS 140-2 validated cryptography with minimal effort.
FIPS Mode in which the FIPS approved algorithms are implemented by the FIPS Object Module and non-FIPS approved algorithms are disabled by default. These non-validated algorithms include, but are not limited to, Blowfish, CAST, IDEA, RC-family, and non-SHA message digest and other algorithms.
The v1.2.x Module is only compatible with OpenSSL 0.9.8 releases, while the v2.0 Module is compatible with OpenSSL 1.0.1 and 1.0.2 releases. The v2.0 Module is the best choice for all new software and product development.
After following the instructions to configure OpenSSL with FIPS support, building Node can be done using the following commands:
$ ./configure --debug --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/build_1_0_2k/lib --shared-openssl-includes=/Users/danielbevenius/work/security/build_1_0_2k/include --openssl-fips=/Users/danielbevenius/work/security/build_1_0_2k/
$ make -j8 test
Crypto init
void InitCrypto(Local<Object> target,
Local<Value> unused,
Local<Context> context,
void* priv) {
static uv_once_t init_once = UV_ONCE_INIT;
uv_once(&init_once, InitCryptoOnce);
Environment* env = Environment::GetCurrent(context);
SecureContext::Initialize(env, target);
SecureContext::Initialize
:
void CipherBase::Initialize(Environment* env, Local<Object> target) {
Local<FunctionTemplate> t = env->NewFunctionTemplate(New);
Lets take a look what New
does:
void SecureContext::New(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
new SecureContext(env, args.This());
}
Usage of tls would look like a normal require (though since this is an native module the loading will be done by NativeModule.require(filename)
:
const tls = require('tls');
This module exports (among other functions):
exports.createSecureContext = require('_tls_common').createSecureContext;
exports.SecureContext = require('_tls_common').SecureContext;
exports.TLSSocket = require('_tls_wrap').TLSSocket;
exports.Server = require('_tls_wrap').Server;
exports.createServer = require('_tls_wrap').createServer;
exports.connect = require('_tls_wrap').connect;
So calling tls.createSecureContext
will end up in `lib/_tls_common.js:
exports.createSecureContext = function createSecureContext(options, context) {
...
var c = new SecureContext(options.secureProtocol, secureOptions, context);
And here we find the call to SecureContext
using new
. At this point this
function will have been bound by SecureContext::Initialize
which is called
by node::crypto::Initialize
in src/node_crypto.cc
. When the tls
module
is required it will load the internal crypto
modulewhich is what will cause
node::crypto::Initialize` to be invoked.
SecureContext::Initialize(env, target)
has the following functions that is
adds:
Local<FunctionTemplate> t = env->NewFunctionTemplate(SecureContext::New);
t->InstanceTemplate()->SetInternalFieldCount(1);
t->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "SecureContext"));
env->SetProtoMethod(t, "init", SecureContext::Init);
env->SetProtoMethod(t, "setKey", SecureContext::SetKey);
env->SetProtoMethod(t, "setCert", SecureContext::SetCert);
env->SetProtoMethod(t, "addCACert", SecureContext::AddCACert);
...
target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "SecureContext"), t->GetFunction());
So, when tls.createSecureContext
returns c
will have been bound to a new
function that was set up above.
In most/all of the functions in the SecureContext class you'll find the following:
SecureContext* sc;
ASSIGN_OR_RETURN_UNWRAP(&sc, args.Holder());
If call to such a function could look like this:
c.context.addCACert('something');
The JavaScript layer will add wrap the native secure context in a object with
a context member for the native secure context, which is the SecureContext on
the C++ side. So when we call addCACert
what is unwrapped by
ASSIGN_OR_RETURN_UNWRAP
is c
which is the native context.
Remember that these are prototype functions that are being setup on instance created when using
new SecureContext
which will be an instance of CipherBase
since this is the type that the New
function returned.
When is the underlying OpenSSL Context created?
This is one in SecureContext::Init
which is called in _tls_common.js
, in the
SecureContext constructor:
this.context = new NativeSecureContext();
this.context.init(secureProtocol);
SecureContext::Init
will setup/initialize the OpenSSL context. This includes
parsing the options passed, secureProtocol
above and calling the correct
TLS_xxx_method(), and then using that SSL_METHOD
pointer to create a new
SSL_CTX.
const binding = internalBinding('crypto');
Recall that internalBinding is a function that is set on the process object when it is set up on node.cc. This will be a call to Binding:
(lldb) br s -f node.cc -l 2723
static void Binding(const FunctionCallbackInfo<Value>& args) {
Local<String> module = args[0]->ToString(env->isolate());
node::Utf8Value module_v(env->isolate(), module);
...
}
(lldb) jlh module
#crypto
Local<Array> modules = env->module_load_list_array();
(lldb) jlh modules
0x16da0049c641: [JSArray]
- map = 0x34afb7a04099 [FastProperties]
- prototype = 0xe7de5189b01
- elements = 0x39e4730d5221 <FixedArray[82]> [FAST_HOLEY_ELEMENTS]
- length = 69
- properties = 0x25494902241 <FixedArray[0]> {
#length: 0x4ddc16a6369 <AccessorInfo> (const accessor descriptor)
}
- elements = 0x39e4730d5221 <FixedArray[82]> {
0: 0x246944578d59 <String[18]: Binding contextify>
1: 0x246944578d89 <String[15]: Binding natives>
2: 0x246944578db1 <String[14]: Binding config>
3: 0x246944578dd9 <String[19]: NativeModule events>
4: 0x246944578e01 <String[18]: Binding async_wrap>
5: 0x246944578e31 <String[11]: Binding icu>
6: 0x246944578e59 <String[17]: NativeModule util>
7: 0x246944578e81 <String[10]: Binding uv>
8: 0x246944578ea9 <String[19]: NativeModule buffer>
9: 0x246944578ed1 <String[14]: Binding buffer>
10: 0x246944578ef9 <String[12]: Binding util>
11: 0x246944578f21 <String[26]: NativeModule internal/util>
12: 0x246944578f49 <String[28]: NativeModule internal/errors>
13: 0x246944578f71 <String[17]: Binding constants>
14: 0x246944578fa1 <String[28]: NativeModule internal/buffer>
15: 0x246944578fc9 <String[19]: NativeModule timers>
16: 0x246944578ff1 <String[18]: Binding timer_wrap>
17: 0x246944579021 <String[32]: NativeModule internal/linkedlist>
18: 0x246944579049 <String[24]: NativeModule async_hooks>
19: 0x246944579071 <String[19]: NativeModule assert>
20: 0x246944579099 <String[29]: NativeModule internal/process>
21: 0x2469445790c1 <String[37]: NativeModule internal/process/warning>
22: 0x2469445790e9 <String[39]: NativeModule internal/process/next_tick>
23: 0x246944579111 <String[38]: NativeModule internal/process/promises>
24: 0x246944579139 <String[35]: NativeModule internal/process/stdio>
25: 0x246944579161 <String[25]: NativeModule internal/url>
26: 0x246944579189 <String[33]: NativeModule internal/querystring>
27: 0x2469445791b1 <String[24]: NativeModule querystring>
28: 0x2469445791d9 <String[11]: Binding url>
29: 0x246944579201 <String[17]: NativeModule path>
30: 0x246944579229 <String[19]: NativeModule module>
31: 0x246944579251 <String[28]: NativeModule internal/module>
32: 0x246944579279 <String[15]: NativeModule vm>
33: 0x2469445792a1 <String[15]: NativeModule fs>
34: 0x39e4730e0651 <String[10]: Binding fs>
35: 0x39e4730e0679 <String[19]: NativeModule stream>
36: 0x39e4730e06a1 <String[36]: NativeModule internal/streams/legacy>
37: 0x39e4730e06c9 <String[29]: NativeModule _stream_readable>
38: 0x39e4730e06f1 <String[40]: NativeModule internal/streams/BufferList>
39: 0x39e4730e0719 <String[37]: NativeModule internal/streams/destroy>
40: 0x39e4730e0741 <String[29]: NativeModule _stream_writable>
41: 0x39e4730e0769 <String[27]: NativeModule _stream_duplex>
42: 0x39e4730e0791 <String[30]: NativeModule _stream_transform>
43: 0x39e4730e07b9 <String[32]: NativeModule _stream_passthrough>
44: 0x39e4730e07e1 <String[21]: Binding fs_event_wrap>
45: 0x39e4730e0811 <String[24]: NativeModule internal/fs>
46: 0x39e4730e0839 <String[17]: Binding inspector>
47: 0x21515bc747c9 <String[15]: NativeModule os>
48: 0x21515bc747f1 <String[10]: Binding os>
49: 0x21515bc74819 <String[26]: NativeModule child_process>
50: 0x21515bc74841 <String[18]: Binding spawn_sync>
51: 0x21515bc74871 <String[17]: Binding pipe_wrap>
52: 0x21515bc748a1 <String[35]: NativeModule internal/child_process>
53: 0x21515bc748c9 <String[27]: NativeModule string_decoder>
54: 0x21515bc748f1 <String[16]: NativeModule net>
55: 0x21515bc74919 <String[25]: NativeModule internal/net>
56: 0x21515bc74941 <String[18]: Binding cares_wrap>
57: 0x21515bc74971 <String[16]: Binding tty_wrap>
58: 0x21515bc74999 <String[16]: Binding tcp_wrap>
59: 0x21515bc749c1 <String[19]: Binding stream_wrap>
60: 0x21515bc749f1 <String[18]: NativeModule dgram>
61: 0x21515bc74a19 <String[16]: Binding udp_wrap>
62: 0x21515bc74a41 <String[20]: Binding process_wrap>
63: 0x21515bc74a71 <String[33]: NativeModule internal/socket_list>
64: 0x21515bc74a99 <String[20]: NativeModule console>
65: 0x21515bc74ac1 <String[16]: NativeModule tty>
66: 0x21515bc74ae9 <String[19]: Binding signal_wrap>
67: 0x35756b2729e1 <String[16]: NativeModule tls>
68: 0xd6549a0f479 <String[16]: NativeModule url>
69-81: 0x25494902351 <the hole>
}
A NativeModule
is a module which has access to the module property and implemented in JavaScript.
A Binding
is something that does not have a module and only set exports.
modules->Set(l, OneByteString(env->isolate(), buf));
(lldb) p buf
(char [1024]) $16 = "Binding crypto"
node_module* mod = get_builtin_module(*module_v);
(lldb) p *mod
(node::node_module) $24 = {
nm_version = 55
nm_flags = 1
nm_dso_handle = 0x0000000000000000
nm_filename = 0x0000000101bc4053 "../src/crypto_impl/node_crypto.cc"
nm_register_func = 0x0000000000000000
nm_context_register_func = 0x000000010180f630 (node`node::crypto::InitCrypto(v8::Local<v8::Object>, v8::Local<v8::Value>, v8::Local<v8::Context>, void*) at node_crypto.cc:6230)
nm_modname = 0x0000000101baffde "crypto"
nm_priv = 0x0000000000000000
nm_link = 0x00000001027c54c0
}
In this case I'm actually documenting and troubleshooting which is the reason for the strange path names to the source files.
if (mod != nullptr) {
exports = Object::New(env->isolate());
// Internal bindings don't have a "module" object, only exports.
CHECK_EQ(mod->nm_register_func, nullptr);
CHECK_NE(mod->nm_context_register_func, nullptr);
Local<Value> unused = Undefined(env->isolate());
mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);
cache->Set(module, exports);
So we check that the nm_register_func
is not set but we should have a context aware register node module function but set the exports
instance to Undefined.
Next mod->nm_context_register_func
is called which was configured in node_crypto.cc:
NODE_MODULE_CONTEXT_AWARE_BUILTIN(crypto, node::crypto::InitCrypto)
So InitCrypto will be the function we end up in which has the call to SecureContext::Initialize(env, target);
which I wanted to know how it ended up there.
InitCrypto
This will intialize the following:
SecureContext::Initialize(env, target);
Connection::Initialize(env, target); // SSLConnection
CipherBase::Initialize(env, target);
DiffieHellman::Initialize(env, target);
ECDH::Initialize(env, target);
Hmac::Initialize(env, target);
Hash::Initialize(env, target);
Sign::Initialize(env, target);
Verify::Initialize(env, target);
Should the constructors for these be checking that they are called with new? Even if these are internal it might be possible to call them using:
const binding = process.binding('crypto');
//const h = new binding.Hmac();
const h = binding.Hmac();
Builtins/Constants/Natives
Using process.binding you can load builtin modules, the constant module, or native modules. The builtin modules are those that are initialized by the loader, for example:
NODE_MODULE_CONTEXT_AWARE_BUILTIN(config, node::InitConfig)
The above module would then be loaded using process.binding('config').
If you look at the Binding
function is src/node.cc you can see that there are two special cases if the module is not
found as a builtin. One if the name passed in is constants
and one if the name passed in is natives
.
constants
Just a note here about how src/node_constant.cc is loaded as there is no NODE_MOUDULE_CONTEXT_AWARE_BUILTIN
macro or anything like that. Instead this will be loaded when:
process.binding('constants').
Which like mentioned earlier in this document this will invoke Binding
(in node.cc) and there is a special clause for constants
:
} else if (!strcmp(*module_v, "constants")) {
exports = Object::New(env->isolate());
CHECK(exports->SetPrototype(env->context(), Null(env->isolate())).FromJust());
DefineConstants(env->isolate(), exports);
cache->Set(module, exports);
DefineConstants
:
DefineErrnoConstants(err_constants);
DefineWindowsErrorConstants(err_constants);
DefineSignalConstants(sig_constants);
DefineUVConstants(os_constants);
DefineSystemConstants(fs_constants);
DefineOpenSSLConstants(crypto_constants);
DefineCryptoConstants(crypto_constants);
DefineZlibConstants(zlib_constants);
os_constants->Set(OneByteString(isolate, "errno"), err_constants);
os_constants->Set(OneByteString(isolate, "signals"), sig_constants);
target->Set(OneByteString(isolate, "os"), os_constants);
target->Set(OneByteString(isolate, "fs"), fs_constants);
target->Set(OneByteString(isolate, "crypto"), crypto_constants);
target->Set(OneByteString(isolate, "zlib"), zlib_constants);
Natives
When you see:
NativeModule._source = process.binding('natives');
Similar to when process.binding('constants')
is used there is a special clause in Binding for this:
} else if (!strcmp(*module_v, "natives")) {
exports = Object::New(env->isolate());
DefineJavaScript(env, exports);
cache->Set(module, exports);
DefineJavaScript
is declared in src/node_javascript.h
but as you might recall there is no implementation in the source code tree for this header the source file is generated
using the JavaScript source files and config.gypi that is generated by configure:
'action_name': 'node_js2c',
'process_outputs_as_sources': 1,
'inputs': [
'<@(library_files)',
'./config.gypi',
],
'outputs': [
'<(SHARED_INTERMEDIATE_DIR)/node_javascript.cc',
...
So we can see that the JavaScript library files are included and config.gypi.
If you call process.binding('config')
what will be returned will be a builtin module as there is one registered:
NODE_MODULE_CONTEXT_AWARE_BUILTIN(config, node::InitConfig)
But this does not populate the process objects config variables which you might think. Instead that is done in node_bootstrap.js:
const _process = NativeModule.require('internal/process');
_process.setupConfig(NativeModule._source);
When I was working on decoupling OpenSSL from node.cc I missed out the macros that are used to conditionally set various settings. For example in GetFeatures
we have:
#ifdef SSL_CTRL_SET_TLSEXT_SERVERNAME_CB
Local<Boolean> tls_sni = True(env->isolate());
#else
Local<Boolean> tls_sni = False(env->isolate());
#endif
obj->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "tls_sni"), tls_sni);
Fix formatting in src/node_crypto.cc X509ToObject
TLS OpenSSL 1.1.0 Issue
I'm tryin to make test/parallel/test-tls-ecdh-disable.js to be used with both OpenSSL 1.0.x and 1.1.x. The issue is that the test successfully listens and the mustNotCall function is called which is the callback passed to the listener event handler registration.
The cipher in use is ECDHE-RSA-AES128-GCM-SHA256
. We can check what ciphers are supported by using the following command:
$ ~/work/security/build_1_1_0f/bin/openssl ciphers -v 'ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS'
ECDHE-ECDSA-AES256-GCM-SHA384 TLSv1.2 Kx=ECDH Au=ECDSA Enc=AESGCM(256) Mac=AEAD
ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2 Kx=ECDH Au=RSA Enc=AESGCM(256) Mac=AEAD
ECDHE-ECDSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH Au=ECDSA Enc=AESGCM(128) Mac=AEAD
ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH Au=RSA Enc=AESGCM(128) Mac=AEAD <----
DHE-RSA-AES256-GCM-SHA384 TLSv1.2 Kx=DH Au=RSA Enc=AESGCM(256) Mac=AEAD
DHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=DH Au=RSA Enc=AESGCM(128) Mac=AEAD
ECDHE-ECDSA-AES256-CCM8 TLSv1.2 Kx=ECDH Au=ECDSA Enc=AESCCM8(256) Mac=AEAD
ECDHE-ECDSA-AES256-CCM TLSv1.2 Kx=ECDH Au=ECDSA Enc=AESCCM(256) Mac=AEAD
ECDHE-ECDSA-AES256-SHA384 TLSv1.2 Kx=ECDH Au=ECDSA Enc=AES(256) Mac=SHA384
ECDHE-RSA-AES256-SHA384 TLSv1.2 Kx=ECDH Au=RSA Enc=AES(256) Mac=SHA384
ECDHE-ECDSA-AES256-SHA TLSv1 Kx=ECDH Au=ECDSA Enc=AES(256) Mac=SHA1
ECDHE-RSA-AES256-SHA TLSv1 Kx=ECDH Au=RSA Enc=AES(256) Mac=SHA1
DHE-RSA-AES256-CCM8 TLSv1.2 Kx=DH Au=RSA Enc=AESCCM8(256) Mac=AEAD
DHE-RSA-AES256-CCM TLSv1.2 Kx=DH Au=RSA Enc=AESCCM(256) Mac=AEAD
DHE-RSA-AES256-SHA256 TLSv1.2 Kx=DH Au=RSA Enc=AES(256) Mac=SHA256
DHE-RSA-AES256-SHA SSLv3 Kx=DH Au=RSA Enc=AES(256) Mac=SHA1
ECDHE-ECDSA-AES128-CCM8 TLSv1.2 Kx=ECDH Au=ECDSA Enc=AESCCM8(128) Mac=AEAD
ECDHE-ECDSA-AES128-CCM TLSv1.2 Kx=ECDH Au=ECDSA Enc=AESCCM(128) Mac=AEAD
ECDHE-ECDSA-AES128-SHA256 TLSv1.2 Kx=ECDH Au=ECDSA Enc=AES(128) Mac=SHA256
ECDHE-RSA-AES128-SHA256 TLSv1.2 Kx=ECDH Au=RSA Enc=AES(128) Mac=SHA256
ECDHE-ECDSA-AES128-SHA TLSv1 Kx=ECDH Au=ECDSA Enc=AES(128) Mac=SHA1
ECDHE-RSA-AES128-SHA TLSv1 Kx=ECDH Au=RSA Enc=AES(128) Mac=SHA1
DHE-RSA-AES128-CCM8 TLSv1.2 Kx=DH Au=RSA Enc=AESCCM8(128) Mac=AEAD
DHE-RSA-AES128-CCM TLSv1.2 Kx=DH Au=RSA Enc=AESCCM(128) Mac=AEAD
DHE-RSA-AES128-SHA256 TLSv1.2 Kx=DH Au=RSA Enc=AES(128) Mac=SHA256
DHE-RSA-AES128-SHA SSLv3 Kx=DH Au=RSA Enc=AES(128) Mac=SHA1
AES256-GCM-SHA384 TLSv1.2 Kx=RSA Au=RSA Enc=AESGCM(256) Mac=AEAD
AES128-GCM-SHA256 TLSv1.2 Kx=RSA Au=RSA Enc=AESGCM(128) Mac=AEAD
AES256-CCM8 TLSv1.2 Kx=RSA Au=RSA Enc=AESCCM8(256) Mac=AEAD
AES256-CCM TLSv1.2 Kx=RSA Au=RSA Enc=AESCCM(256) Mac=AEAD
AES128-CCM8 TLSv1.2 Kx=RSA Au=RSA Enc=AESCCM8(128) Mac=AEAD
AES128-CCM TLSv1.2 Kx=RSA Au=RSA Enc=AESCCM(128) Mac=AEAD
AES256-SHA256 TLSv1.2 Kx=RSA Au=RSA Enc=AES(256) Mac=SHA256
AES128-SHA256 TLSv1.2 Kx=RSA Au=RSA Enc=AES(128) Mac=SHA256
AES256-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(256) Mac=SHA1
AES128-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(128) Mac=SHA1
Zlib
The version we are using on Fedora is:
zlib: '1.2.8'
The version building locally is:
zlib: '1.2.11',
The issue I'm seeing is that test/parallel/test-zlib-failed-init.js
passes as expected when using version 1.2.11 but fails
when using 1.2.8.
The change log for zlib can be found here: http://zlib.net/ChangeLog.txt
Changes in 1.2.9 (31 Dec 2016)
...
- Reject a window size of 256 bytes if not using the zlib wrapper
...
ICU
We are currently building using --with-intl=system-icu which gives the following icu version:
icu: '57.1'
When I build locally and without any icu flags the version is:
The issue I'm seeing is that test/parallel/test-icu-data-dir.js is failing:
assert.js:60
throw new errors.AssertionError({
^
AssertionError [ERR_ASSERTION]: false == true
at Object.<anonymous> (/root/rpmbuild/BUILD/node-v8.1.0/test/parallel/test-icu-data-dir.js:13:3)
const child = spawnSync(process.execPath, ['--icu-data-dir=/', '-e', '0']);
assert(child.stderr.toString().includes(expected));
Fips
When using the bundled/deps version of OpenSSL and the just building without any configuration options, OpenSSL FIPS support is not enabled.
The test test/parallel/test-crypto-fips.js
performs a number of checks and assumes the above default configuration. I'm running into an issue when configuring Node using --shared-openssl and the system version of OpenSSL supports FIPS (but I'm not setting anything FIPS related when configuring only the shared includes and shared lib directory. When I do this the mentioned test fails when trying to toggle fips_mode using a OpenSSL configuration file.
[root@1b05bc2e415c node-v8.1.0]# openssl version
OpenSSL 1.0.2k-fips 26 Jan 2017
out/Release/node test/parallel/test-icu-data-dir.js:
Spawned child [pid:9757] with cmd 'require("crypto").fips' expect 0 with args '--openssl-config=/root/rpmbuild/BUILD/node-v8.1.0/test/fixtures/openssl_fips_enabled.cnf' OPENSSL_CONF=undefined
assert.js:60
throw new errors.AssertionError({
^
AssertionError [ERR_ASSERTION]: 0 === 1
at responseHandler (/root/rpmbuild/BUILD/node-v8.1.0/test/parallel/test-crypto-fips.js:56:14)
You might have to manually remove config_fips.gypi
when you want to reconfigure fips.
Building the bundled openssl with fips
$ ./configure --openssl-fips=/Users/danielbevenius/work/security/build_1_0_2k/
$ out/Release/node
> process.versions.openssl
'1.0.2l-fips'
Node N-API
Lets start from the beginning and look at the initialization of a napi addons:
NAPI_MODULE(NODE_GYP_MODULE_NAME, Init)
This macro is defined in src/node_api.h
:
#define NAPI_MODULE_X(modname, regfunc, priv, flags) \
EXTERN_C_START \
static napi_module _module = \
{ \
NAPI_MODULE_VERSION, \
flags, \
__FILE__, \
regfunc, \
#modname, \
priv, \
{0}, \
}; \
NAPI_C_CTOR(_register_ ## modname) { \
napi_module_register(&_module); \
} \
EXTERN_C_END
#define NAPI_MODULE(modname, regfunc) \
NAPI_MODULE_X(modname, regfunc, NULL, 0) // NOLINT (readability/null_usage)
This is basically the same as what we have seen for a normal addons and would expend to:
extern "C" {
static napi_module _module = {
NAPI_MODULE_VERSION,
flags,
__FILE__,
Init,
NODE_GYP_MODULE_NAME,
priv,
{0},
};
static void _register_NODE_GYP_MODULE_NAME(void) __attribute__((constructor));
static void _register_NODE_GYP_MODULE_NAME(void) {
napi_module_register(&_module);
}
}
It's run when a shared library is loaded.
void napi_module_register(napi_module* mod) {
node::node_module* nm = new node::node_module {
-1,
mod->nm_flags,
nullptr,
mod->nm_filename,
nullptr,
napi_module_register_cb,
mod->nm_modname,
mod, // priv
nullptr,
};
node::node_module_register(nm);
}
Let's set a break point on NAPI_MODULE_INIT
and take a closer look at it:
$ ./configure --debug && make -j8
$ make build-addons-napi
$ lldb -- ./node test/addons-napi/1_hello_world/test.js
(lldb) br s -f binding.c -l 13
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) r
Doing a back trace we will see
(lldb) bt 10
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x00000001027adc5b binding.node`_register_binding at binding.c:13
frame #1: 0x00000001026eeac6 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 420
frame #2: 0x00000001026eecf6 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
frame #3: 0x00000001026ea218 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 330
frame #4: 0x00000001026e934e dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 134
frame #5: 0x00000001026e93e2 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 74
frame #6: 0x00000001026dd3e5 dyld`dyld::runInitializers(ImageLoader*) + 82
frame #7: 0x00000001026e60a8 dyld`dlopen + 527
frame #8: 0x00007fff6704fd86 libdyld.dylib`dlopen + 86
frame #9: 0x000000010003b238 node`node::DLOpen(v8::FunctionCallbackInfo<v8::Value> const&) [inlined] node::DLib::Open() at node.cc:1158 [opt]
frame #10: 0x000000010003b224 node`node::DLOpen(args=<unavailable>) at node.cc:1253 [opt]
The example we are looking at is test/addons-napi/1_hello_world
and
const binding = require(`./build/${common.buildType}/binding`);
Notice that this does not have a file extension, and one of the file extensions that will be tried is .node
.
which will be test/addons-napi/1_hello_world/build/Debug/binding.node
.
From lib/internal/modules/cjs/loader.js
:
// Native extension for .node
Module._extensions['.node'] = function(module, filename) {
return process.dlopen(module, path.toNamespacedPath(filename));
};
Each addon is a dynamical library which is loaded by LDOpen in src/node.cc
NAPI_MODULE_INIT() {
napi_property_descriptor desc = DECLARE_NAPI_PROPERTY("hello", Method);
NAPI_CALL(env, napi_define_properties(env, exports, 1, &desc));
return exports;
Note that DECLARE_NAPI_PROPERTY
is a macro defined in test/common.h
and just initializes the struct with:
napi_property_descriptor desc = { "hello", 0, Method, 0, 0, 0, napi_default, 0);
The napi_property_descriptor
struct look like this:
typedef struct {
// One of utf8name or name should be NULL.
const char* utf8name;
napi_value name;
napi_callback method;
napi_callback getter;
napi_callback setter;
napi_value value;
napi_property_attributes attributes;
void* data;
} napi_property_descriptor;
napi_callback
is a function pointer, which is getting set to 0. Why is this not set to NULL?
typedef napi_value (*napi_callback)(napi_env env, napi_callback_info info);
napi_default
is an enum in src/node_api_types.h
:
typedef enum {
napi_default = 0,
napi_writable = 1 << 0,
napi_enumerable = 1 << 1,
napi_configurable = 1 << 2,
// Used with napi_define_class to distinguish static properties
// from instance properties. Ignored by napi_define_properties.
napi_static = 1 << 10,
} napi_property_attributes;
All JavaScript values are abstracted behind an opaque type named napi_value. This is a way hide details from the using code
and they don't have access to the actual implementation of the struct. In napi_value
's case it is just a pointer and
no additional information exist, but for other types like napi_env:
typedef struct napi_env__* napi_env;
napi_env__
contains information like the Isolate, the node::Environment for example and is defined in src/node_api.cc which
is the implementation. So the while you can see the contents of the struct in the debugger trying to access any member
information will get an error like the following:
../binding.c:7:6: error: incomplete definition of type 'struct napi_env__'
env->isolate;
~~~^
/Users/danielbevenius/work/nodejs/node-poc/src/node_api_types.h:13:16: note: forward declaration of 'struct napi_env__'
typedef struct napi_env__* napi_env;
^
1 error generated.
The way these types are used is by passing them to functions that operate on them, and those functions have access to the internal type definitions. This is also what OpenSSL does for it's type system.
FIXED_ONE_BYTE_STRING
This is a macro which can be found in src/util.h
:
// Used to be a macro, hence the uppercase name.
template <int N>
inline v8::Local<v8::String> FIXED_ONE_BYTE_STRING(
v8::Isolate* isolate,
const char(&data)[N]) {
return OneByteString(isolate, data, N - 1);
}
const char(&data)
is the same as const char&[]
const char (&)[]
So we are passing in the string which recieved as a const char* in the OneByteString function. This is then reinterpreted to const uint8_t*.
OneByteString: the string's bytes are not UTF-8 encoded, can only contain characters in the first 256 unicode code points TwoByteString: Same, but uses two bytes for each character (using surrogate pairs to represent unicode characters that can't be represented in two bytes).
Binary Compatability
Is when a program is linked dynmically to a former version and does not have to be recompiled. If a program needs to be recompiled to run with a new version of library but doesn't require any further modifications, the library is source compatible.
Application Binary Interface (ABI)
An ABI defines the structures and methods used to access external, already compiled libraries/code at the level of machine code. The headers decsribing classes, functions etc are compiled to a set of addresses and expected parameters, memory structure sizes and layout. The application using the ABI must be compiled so that these addresses, the expected paramters, memory layout etc match those that that the ABI provider provided. Changes to the header files can be made but must be done carfully to ensure that the compability is not changed, so that the addresses produced when compiling are not changed which would mean that exising users of the compiled code would become incompatible.
Keeping an ABI stable means not changing function interfaces (return type and number, types, and order of arguments), definitions of data types or data structures, defined constants, etc. New functions and data types can be added, but existing ones must stay the same.
Making a namespace change in C++ will generate a different mangled name so the symbol in the object file will be different.
In Node there is NODE_MODULE_VERSION
which is defined in node_version.h
C++ Lint task
Verify that functions that return pointers have the pointer operator in the correct place (to the left).
The rules can be found in 'tools/cpplint.py' and can be run using:
tools/cpplint.py files
The script will run through the files and for each one call ProcessFile which does some checking and then calls ProcessFileData and later ProcessLine. Not that you can pass in extra_check_functions which might become handy.
NDEBUG
I've seen this in couple of places in the node source code and did not know about it. NDBUG is used to toggle/control whether the assert macro will expand into something that will perform a check or not.
From <assert.h>
:
#ifdef NDEBUG
#define assert(condition) ((void)0)
#else
#define assert(condition)
#endif
assert
This macro is disabled if, at the moment of including <assert.h>, a macro with the name NDEBUG has already been defined. This allows for a coder to include as many assert calls as needed in a source code while debugging the program and then disable all of them for the production version by simply including a line like:
#define NDEBUG
If the NDEBUG macro is defined when <assert.h> is included the assets are disabled. While looking into adding a addons test I noticed that at-exit undefined NDEBUG but not the other tests that use assert. As far as I can tell there is no need to undefine this and non of the other tests do. This commit removes the undef for consistency.
Building a addons
Change to the directory of the addons and then you can rebuild using:
$ out/Debug/node deps/npm/node_modules/node-gyp/bin/node-gyp.js rebuild --directory=test/addons/at-exit --nodedir=../../../
On windows you can just copy the command from the output from, for example:
"C:\\Users\\danbev\\working\\node\\Release\\node.exe" "C:\\Users\\danbev\\working\\node\\deps\\npm\\node_modules\\node-gyp\\bin\\node-gyp" "rebuild" "--directory=test\\addons\\openssl-client-cert-engine" "--nodedir=C:\\Users\\danbev\\working\\node"
Run an a set of JavaScript tests
You can specify the directory to run test, for example this would only run the async-hooks tests:
$ python tools/test.py --mode=release -J async-hooks
Event Loop
It all starts with the javascript file to be executed, which is you main program.
------------> javascript.js ------+-----------------------------+
/|\ | setTimeout/SetInterval
| | JavaScript callbacks --------------------------------------+
| | |
| | network/disk/child_processes |
| | JavaScript callbacks --------------------------------------+
| | |----> callback ------> nextTick callback ------------------------------+
| | setImmedate | /|\ |
+ | JavaScript callbacks --------------------------------------| | |
\ | | +---- process resolved promises ---------+
\ | close events |
\ | JavaScript callbacks --------------------------------------|
\ |
\ |
<----------- process.exit (event) <-----------------------------+
Where is the first interaction with libuv in node? There very first call is (in node.cc):
argv = uv_setup_args(argc, argv);
But this does not do anything with the event loop. For that we have to look at Init
:
prog_start_time = static_cast<double>(uv_now(uv_default_loop()));
This will land us in uv_common.c:
uv_loop_t* uv_default_loop(void) {
if (default_loop_ptr != NULL)
return default_loop_ptr;
if (uv_loop_init(&default_loop_struct))
(lldb) p default_loop_ptr
(uv_loop_t *) $4 = 0x0000000000000000
So we can see that this is the first call to uv_default_loop
so lets take a closer look at uv_loop_init
in src/unix/loop.c (in libuv that is):
uv__signal_global_once_init();
Which will call:
uv_once(&uv__signal_global_init_guard, uv__signal_global_init);
I just skimmed the code but this looks like setting up fork handlers to reset signals. uv_once uses pthread_once. After this we will be back in loop.c:
heap_init((struct heap*) &loop->timer_heap);
Where are initializing a min heap data structur for timers.
QUEUE_INIT(&loop->wq)
wq
is a work queue (include/uv-threadpool.h):
void* wq[2];
What does the macro QUEUE_INIT
do?
It is defined in src/queue.h
as:
QUEUE_NEXT(q) = (q); \
QUEUE_PREV(q) = (q);
typedef void *QUEUE[2];
...
#define QUEUE_NEXT(q) (*(QUEUE **) &((*(q))[0]))
This will set &loop->wp[0]
And the same goes for handle_queue and active_reqs which will be initialized using QUEUE_INIT as well:
void* handle_queue[2];
void* active_reqs[2];
QUEUE_INIT(&loop->active_reqs);
QUEUE_INIT(&loop->idle_handles);
...
err = uv__platform_loop_init(loop);
uv__platform_loop_init
can be found in src/unix/darwin.c:
if (uv__kqueue_init(loop))
uv__kqueue_init
can be found in src/unix/kqueue.c:
loop->backend_fd = kqueue()
Resolve promises
After calling the callback provided by the user, a smaller loop will check if there are any resolved promises.
Where is this done? If we take a look at lib/internal/bootstrap.js and the start function we find the following line:
NativeModule.require('internal/process/next_tick').setup();
And lib/internal/process/next_tick.js:
exports.setup = setupNextTick;
setupNextTick
then does:
const promises = require('internal/process/promises');
...
const emitPendingUnhandledRejections = promises.setup(scheduleMicrotasks);
promises.setup will call process._setupPromises:
process._setupPromises(function(event, promise, reason) {
Notice that this call takes an anonymous function as its callback. _setupPromises
is configured in node.cc in the SetupProcessObject function:
env->SetMethod(process, "_setupPromises", SetupPromises);
Lets set a break point in SetupPromis and see what is going on:
(lldb) br s -f node.cc -l 1278
(lldb) r
isolate->SetPromiseRejectCallback(PromiseRejectCallback);
So an isolate has a promise_reject_callback_
which is being set here to PromiseRejectCallback
. This is what V8 will call if a promise is rejected.
env->set_promise_reject_function(args[0].As<Function>());
Remember _setupPromises
was passed an anonymous function as its callback argument which is what is being set as the
set_promise_reject_function on the evnvironment instance. This will then be used by PromiseRejectCallback
:
Local<Function> callback = env->promise_reject_function();
...
callback->Call(process, arraysize(args), args);
Next things that happens in SetupPromises
is that _setupPromises
is deleted from the process object:
env->process_object()->Delete(
env->context(),
FIXED_ONE_BYTE_STRING(args.GetIsolate(), "_setupPromises")).FromJust();
This way it cannot be called again.
Back in JavaScript (setupNextTick) now we have:
var _runMicrotasks = {};
...
const tickInfo = process._setupNextTick(_tickCallback, _runMicrotasks);
process._setupNextTick
is also configured in node.cc:
env->SetMethod(process, "_setupNextTick", SetupNextTick);
The arguments to SetupNextTick will be:
CHECK(args[0]->IsFunction()); // _tickCallback
CHECK(args[1]->IsObject()); // _runMicrotasks which is just an empty object at this point.
env->set_tick_callback_function(args[0].As<Function>());
env->SetMethod(args[1].As<Object>(), "runMicrotasks", RunMicrotasks);
RunMicroTasks does:
void RunMicrotasks(const FunctionCallbackInfo<Value>& args) {
args.GetIsolate()->RunMicrotasks();
}
So we are setting a method on the _runMicrotasks object named runMicrotasks.
Next _setupNextTick
is removed from the process object. Then we have:
uint32_t* const fields = env->tick_info()->fields();
uint32_t const fields_count = env->tick_info()->fields_count();
Local<ArrayBuffer> array_buffer = ArrayBuffer::New(env->isolate(), fields, sizeof(*fields) * fields_count);
args.GetReturnValue().Set(Uint32Array::New(array_buffer, 0, fields_count));
const p = new Promise((resolve, reject) => {
resolve('ok');
});
So how are promises created in V8?
(lldb) br s -f isolate.cc -l 1862
This will break in isolate.cc PushPromise
:
void Isolate::PushPromise(Handle<JSObject> promise) {
ThreadLocalTop* tltop = thread_local_top();
PromiseOnStack* prev = tltop->promise_on_stack_;
Handle<JSObject> global_promise = global_handles()->Create(*promise);
tltop->promise_on_stack_ = new PromiseOnStack(global_promise, prev);
}
Lets take a closer look at new PromiseOnStack
:
class PromiseOnStack {
public:
PromiseOnStack(Handle<JSObject> promise, PromiseOnStack* prev)
: promise_(promise), prev_(prev) {}
Handle<JSObject> promise() { return promise_; }
PromiseOnStack* prev() { return prev_; }
private:
Handle<JSObject> promise_;
PromiseOnStack* prev_;
};
So that looks simple enough, it has a promise and a pointer to the previous promise. The callback passed to Promise (which takes a resolve, reject), when is it called? This is part of a constructor call so it would be executed straight way I think, like any constructor.
PromiseRejectCallback
void PromiseRejectCallback(PromiseRejectMessage message) {
Local<Promise> promise = message.GetPromise();
Isolate* isolate = promise->GetIsolate();
Local<Value> value = message.GetValue();
Local<Integer> event = Integer::New(isolate, message.GetEvent());
Environment* env = Environment::GetCurrent(isolate);
Local<Function> callback = env->promise_reject_function();
if (value.IsEmpty())
value = Undefined(isolate);
Local<Value> args[] = { event, promise, value };
Local<Object> process = env->process_object();
callback->Call(process, arraysize(args), args);
}
After a module has been loaded in Module.runMain, the process._tickCallback function will first process all the callbacks in the nextTickQueue and then call _runMicrotasks().
nextTick
After resolving all the promises and callbacks added using nextTick will be called.
// bootstrap main module.
Module.runMain = function() {
// Load the main module--the command line argument.
Module._load(process.argv[1], null, true);
// Handle any nextTicks added in the first tick of the program
process._tickCallback();
};
Libuv Thread pool
The following modules use the thread pool:
- fs.*
- dns.lookup This is because when using a hostname to lookup there are operating system specific tasks that have to be performed, like on unix /etc/hosts should be used etc.
- pipes
- crypto (crypto.randomBytes(), and crypto.pbkdf2())
- http.get/request() is called with a name in which case dns.loopup() is used
- any c++ addons that uses it
The follwing modules use the Kernel:
- tcp/udp sockets, servers
- unix domain sockets, servers
- pipes
- tty input
- dns.resolveXXX
The default size of the thread pool is 4 (uv_threadpool_size). Just to be clear, the libuv Event Loop is single threaded. The thread pool is used for file I/O operations.
SEGV_MAPERR
Is a segmentation fault, which is an invalid memory access. This can happen when trying to access a page that is not mapped into the address space of the running process. This can happen when dereferencing a null pointer or a pointer that was corrupted with a small integer value.
core dump
Make sure that you specified enough room for a core dump:
$ ulimit -c unlimited
If possible compile the executable with debugging options. The example I'm using for this is a case where
cctest on arm7 produces a core dump:
```console
[----------] 2 tests from EnvironmentTest
[ RUN ] EnvironmentTest.AtExitWithEnvironment
[ OK ] EnvironmentTest.AtExitWithEnvironment (114 ms)
[ RUN ] EnvironmentTest.AtExitWithArgument
Received signal 11 SEGV_MAPERR 000007a0547c
==== C stack trace ===============================
[end of stack trace]
Segmentation fault (core dumped)
$ gdb out/Debug/cctest qemu_cctest_20170809-174708_18638.core
Reading symbols from out/Release/cctest...done.
[New LWP 18638]
[New LWP 18644]
[New LWP 18645]
[New LWP 18646]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `out/Release/cctest'.
#0 0x005e4f8c in v8::Context::Exit() ()
(gdb) bt
#0 0x005e4f8c in v8::Context::Exit() ()
#1 0x01033d94 in node::Environment::Start(int, char const* const*, int, char const* const*, bool) ()
#2 0x010395f4 in node::CreateEnvironment(node::IsolateData*, v8::Local<v8::Context>, int, char const* const*, int, char const* const*) ()
#3 0x00e6087c in EnvironmentTest_AtExitWithArgument_Test::TestBody() ()
#4 0x00eafb12 in testing::Test::Run() ()
#5 0x00eafcfc in testing::TestInfo::Run() [clone .part.402] ()
#6 0x00eafde0 in testing::TestCase::Run() [clone .part.403] ()
#7 0x00eb15b0 in testing::internal::UnitTestImpl::RunAllTests() [clone .part.407] ()
#8 0x00eb17e6 in testing::UnitTest::Run() ()
#9 0x004302dc in main ()
I was not able to reproduce this issue when compiling with debugging symbols (./configure --debug). Possible reasons for this? The debugger puts more on the stack and if you overwrite an arrays capacity it migth cause a segment fault in release mode but not in debug mode.
Trying to find out where v8::Context::Exit() is call when in node::Environment::Start. Upon entry there is the following:
Context::Scope context_scope(context());
This is the context scope used and this is using RAII, and the descructor that will be called when this instance goes out of scope looks like this (in v8.h):
V8_INLINE ~Scope() { context_->Exit(); }
Workaround issue with test/async-hooks/init-hooks.js
$ launchctl unload -w /System/Library/LaunchAgents/com.apple.ReportCrash.plist
$ sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.ReportCrash.Root.plist
LIKELY/UNLIKELY
If you take a look at the CHECK macro you will se that it is defined like this:
#define CHECK(expr) \
do { \
if (UNLIKELY(!(expr))) { \
static const char* const args[] = { __FILE__, STRINGIFY(__LINE__), \
#expr, PRETTY_FUNCTION_NAME }; \
node::Assert(&args); \
} \
} while (0)
And UNLIKELY is defined as:
#define UNLIKELY(expr) __builtin_expect(!!(expr), 0)
This is done to give the compiler a hint that we expect the value to be true so it can optimise for that case and not the opposite case.
Context
Allows separate unrelated JavaScript to run in an isolate without interfering with each other. What does that mean for a C++ function that is called by javascript. The context should be the one associated with the call.
Tracing
Tracing refers to tracing events that can be emitted by V8. V8 takes a --enable-tracing
flag which enables
tracing and will create a v8_trace.json file. This file can be opened in chrome and inspected using
chrome://tracing
This can also be enabled in Node using --trace-events-enabled
and you can specify categories to be traced
using --trace-event-categories
.
Microsoft Build Engine (MSBuild)
Is a build platform that allows for an xml configuration file to configure and run complex build systems. This is what is produced by gyp for windows (the project file I mean). So you can take a look at the project file for the specific gyp target and see what shows up where in it to troubleshoot configuration issues.
A target is like a procedure. A target can contain properties in a PropertiesGroup and tasks.
A task is like a builtin function that performs an action, compiles, links, prints a message etc.
A project is what hooks tasks together. A task can have a DefaultTargets
attribute which are the targets
that will be run if you specify the project file on the command line for msbuild.
msbuild sample.vcxproj
The target can also be specified on the command line:
```console
msbuild sample.vcxproj /t:something
msbuild sample.vcxproj /t:Clean;Build
In the xml you might come across something like: %(AdditionalOptions). This an iterate directive.
You can increase the level of information provided by msbuild by specifying the /verbosity setting: quit, minimal, normal, detailed, and diagnostic.
Options: /MP Compile multiple sources by using multiple processes /link passes the specified option to LINK
vcbuild.bat
This is the Windows batch script that is used on Windows systems.
vcbuild.bat /help
%var% replaced with the environment variable named 'var'.
%n where n is a number from 0-9 and gives access to the n argument passed on the command line.
%* all the arguments specified on the command line
You can place quotation marks around a string to avoid escaping. You can place caret (^), an escape character, immediately before the special characters The special characters that need quoting or escaping are usually <, >, |, &, and ^
echo "You & me"
echo You ^& me
cd %~dp0 this will to %~dp (the directory of) on %0 (the first command line argument).
So if this is run from a diferent directory this will switch to the directory where
this script is.
if /i case-insensitive, for example:
if /i "%1"=="help" goto help
Configuring on windows:
python configure --openssl-system-ca-path=PATH
vcbuild noprojgen
Compiler
cl.exe
is the the compiler.
cl -help
Scripting
`%~1` removes quotes from the first command line argument
Use NUL
to discard out put, as in comman > NUL
IF "%var%"=="" (SET var=something)
IF NOT DEFINED var (SET var=something)
IF /I "%ERRORLEVEL%" NEQ "0" (
echo "failed to execute something"
)
Performace counters
Are used used to provide information as to how well the operating system or an application is performing.
DTrace
is supported on linux, solaris, mac, and bsd and is a tracing framework originally developed for Solaris
by Sun.
This is enabled by configuring with the --with-dtrace
flag. There are three object files for dtrace in
Node. These are:
node_dtrace.o which is used by all systems that support dtrace
node_dtrace_ustack.o is not supported on linux or mac
node_dtrace_provider.o is supported on all but macosx
Note that the directory where node_dtrace_provider.o is generated differs so if you are working on a task
that effects linking take that into account.
Event Tracing for Windows (ETW)
Is an efficient kernel level tracing mechanism allowing logging of kernel or application defined events to a log file. It also allows dynamic enable/disable so this can be performed on a running application.
An event provider writes events to an ETW session. Additional data is added by ETW like the time the event happened, the process that wrote it and the thread id, the processor number, the CPU usage data. This info is then available to event consumers which are applications that read the log file or the consumer can listen to realtime events and process them.
There is a condition in node.gypi that depends on node_etw
which looks like this:
[ 'node_use_etw=="true"', {
'defines': [ 'HAVE_ETW=1' ],
'dependencies': [ 'node_etw' ],
'sources': [
'src/node_win32_etw_provider.h',
'src/node_win32_etw_provider-inl.h',
'src/node_win32_etw_provider.cc',
'src/node_dtrace.cc',
'tools/msvs/genfiles/node_etw_provider.h',
'tools/msvs/genfiles/node_etw_provider.rc',
]
} ],
So remember that node_etw
will be run first so lets take a look at it first.
# generate ETW header and resource files
{
'target_name': 'node_etw',
'type': 'none',
'conditions': [
[ 'node_use_etw=="true"', {
'actions': [
{
'action_name': 'node_etw',
'inputs': [ 'src/res/node_etw_provider.man' ],
'outputs': [
'tools/msvs/genfiles/node_etw_provider.rc',
'tools/msvs/genfiles/node_etw_provider.h',
'tools/msvs/genfiles/node_etw_providerTEMP.BIN',
],
'action': [ 'mc <@(_inputs) -h tools/msvs/genfiles -r tools/msvs/genfiles' ]
}
]
} ]
]
},
mc
is the Message Compiler.
-h
is where you want the generated header files to be placed.
-r
is where you want the generated resource compiler script (.rc file) and the generated binary resource file (.bin)
The node_etw_providerTEMP.bin is a binary resource file that contains the provider and event metadata. This is the
template resource, which is signified by the TEMP suffix of the base name of the file.
And the input is the manifest file (.man). The manifest registers an event producer named NodeJS-ETW-provider
:
<provider name="NodeJS-ETW-provider"
guid="{77754E9B-264B-4D8D-B981-E4135C1ECB0C}"
symbol="NODE_ETW_PROVIDER"
message="$(string.NodeJS-ETW-provider.name)"
resourceFileName="node.exe"
messageFileName="node.exe">
Notice the message
attribute which is used for localization which will be matched with :
<localization>
<resources culture="en-US">
<stringTable>
<string id="NodeJS-ETW-provider.name" value="Node.js ETW Provider"/>
Tasks are typically used to identify major components of the provider, some form of grouping. In node there is one task:
<task name="MethodRuntime" value="1" symbol="JSCRIPT_METHOD_RUNTIME_TASK">
<opcodes>
<opcode name="MethodLoad" value="10" symbol="JSCRIPT_METHOD_METHODLOAD_OPCODE"/>
</opcodes>
</task>
This grouping enables consumers to query only for specific tasks and opcode combinations. The opcodes
are specific to the task MethodRuntime
. But there are also global opcodes:
<opcodes>
<opcode name="NODE_HTTP_SERVER_REQUEST" value="10"/>
<opcode name="NODE_HTTP_SERVER_RESPONSE" value="11"/>
<opcode name="NODE_HTTP_CLIENT_REQUEST" value="12"/>
<opcode name="NODE_HTTP_CLIENT_RESPONSE" value="13"/>
<opcode name="NODE_NET_SERVER_CONNECTION" value="14"/>
<opcode name="NODE_NET_STREAM_END" value="15"/>
<opcode name="NODE_GC_START" value="16"/>
<opcode name="NODE_GC_DONE" value="17"/>
<opcode name="NODE_V8SYMBOL_REMOVE" value="21"/>
<opcode name="NODE_V8SYMBOL_MOVE" value="22"/>
<opcode name="NODE_V8SYMBOL_RESET" value="23"/>
</opcodes>
Templates are used to define event specific data that the provider includes with an event. For example in node one template looks like this:
<template tid="node_connection">
<data name="fd" inType="win:UInt32" />
<data name="port" inType="win:UInt32" />
<data name="remote" inType="win:AnsiString" />
<data name="buffered" inType="win:UInt32" />
</template>
Events have to be defined and can refer to template like the following:
<event value="2"
opcode="NODE_HTTP_SERVER_RESPONSE"
template="node_connection"
symbol="NODE_HTTP_SERVER_RESPONSE_EVENT"
message="$(string.NodeJS-ETW-provider.event.2.message)"
level="win:Informational"/>
Ok, so we can see how the provider and events are configured. This information is used by the mc
tool
to generate headers and a binary file.
The provider headers is in src/node_win32_etw_provider.h. For each of the opcodes there are function declarations
and init_etw() and shutdown_etw(). There is an internal header src/node_win32_etw_provider-inl.h
which includes
the generated node_etw_provider.h
which is located in 'tools/msvs/genfiles/node_etw_provider.h'.
Performace counters
Are used to provide information how node is performing.
There is condition in node.gypi that depends on node_perfctr
which looks like this:
[ 'node_use_perfctr=="true"', {
'defines': [ 'HAVE_PERFCTR=1' ],
'dependencies': [ 'node_perfctr' ],
'sources': [
'src/node_win32_perfctr_provider.h',
'src/node_win32_perfctr_provider.cc',
'src/node_counters.cc',
'src/node_counters.h',
'tools/msvs/genfiles/node_perfctr_provider.rc',
]
} ],
This file is used in the node_perfctr
(node performance counter) target and it is an action target that invokes ctrpp
The input to ctrpp is src/res/node_perfctr_provider.man
{
'action_name': 'node_perfctr_man',
'inputs': [ 'src/res/node_perfctr_provider.man' ],
'outputs': [
'tools/msvs/genfiles/node_perfctr_provider.h',
'tools/msvs/genfiles/node_perfctr_provider.rc',
'tools/msvs/genfiles/MSG00001.BIN',
],
'action': [ 'ctrpp <@(_inputs) '
'-o tools/msvs/genfiles/node_perfctr_provider.h '
'-rc tools/msvs/genfiles/node_perfctr_provider.rc'
]
},
-o
specifies the header that will be generated.
-rc
specifies the file that will be generated.
'src/node_win32_perfctr_provider.cc' includes node_perfctr_provider.h
Notice this this is much like the ETW and manifest based. Lets look at the manifest. Instead of containing events this
file contains counters.
The MSG00001.BIN file is a binary resource file for each language you specify (only one in our case).
But when is the actual .res file created?
I can see these are part of the statically linked library:
dumpbin /archivemembers Release/lib/node.lib
I can see that path is Release\obj\node\node_etw_provider.res and Release\obj\node\node_perfctr_provider.res. Do these files somehow refer to node.lib causing /WHOLEARCHIVE to fail? Is there some way to exclude object from the lib specified by /WHOLEARCHIVE? A .res file is a compiled resource file generated by rc.exe. By default this is created in the same directory as the .rc file.
CTRPP
The CTRPP tool is a pre-processor that parses and validates your counters manifest. The tool also generates code that you use to provide your counter data.
Tracker.exe
Scans .tlog files to figure out what a projects output files are. This is used to know if something should be updated/recompiled I guess. Adding this as you might come across an issue on windows where the last things that is logged is output from the tracker but infact if there is a link error this would not be something that the tracker would report, so you'll need to look further back in the console output.
Inspecting a .lib on windows
DUMPBIN /ARCHIVEMEMBERS Release\lib\node.lib
Linux Trace Toolkit: next generation (lttng)
LTTng is an open source tracing framework for Linux.
Does it make sense to have the node.res file when node is linked as a static library? Would'nt the case be that the icon and other resource info be that of the embedders.
But out about the ETW and performance counters resources? Resources (as far as I can tell) are intended to be linked to the exe or a dynamic library. It does not seem like it is possible to add
Node executable linking
By default, just building node using ./configure && make -j8 the node executable will be statically linked:
$ otool -L out/Debug/node
out/Debug/node:
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1259.11.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.1.0)
GYP
For addon tests in node there can be the need to link against a library in the root directory. The problem is that you cannot use <(PRODUCT_DIR) as
that is the product dir for the addon itself. There is also the issue that the output directory is different on windows, it is not out
but instead
the Release
and Debug
directories are in the root of the project. But there are variables available by the respective node-gyp evironments,
for example on Window you can use:
['OS=="win"', {
'libraries': ['../../../../$(Configuration)/lib/zlib.lib'],
}, {
'libraries': ['../../../../out/$(BUILDTYPE)/libzlib.a'],
}],
serdes
This is a built in module for serialization/deserialization which is currently marked as experimental. It allows for serializing JavaScript values in a way compatiable with https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, which an algorithm for copying complex JavaScript objects and used internally for transferring data to and from Web Workers view postMessage().
Failing test without network connection
make[1]: Leaving directory `/root/rpmbuild/BUILD/node-v8.7.0'
/usr/bin/python2.7 tools/test.py --mode=release -J \
async-hooks \
abort doctool es-module inspector known_issues message parallel pseudo-tty sequential \
addons addons-napi
=== release test-dgram-membership ===
Path: parallel/test-dgram-membership
assert.js:41
throw new errors.AssertionError({
^
AssertionError [ERR_ASSERTION]: Got unwanted exception.
at innerThrows (assert.js:653:7)
at Function.doesNotThrow (assert.js:665:3)
at Object.<anonymous> (/root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-membership.js:83:10)
at Module._compile (module.js:624:30)
at Object.Module._extensions..js (module.js:635:10)
at Module.load (module.js:545:32)
at tryModuleLoad (module.js:508:12)
at Function.Module._load (module.js:500:3)
at Function.Module.runMain (module.js:665:10)
at startup (bootstrap_node.js:187:16)
Command: out/Release/node /root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-membership.js
=== release test-dgram-multicast-set-interface-lo ===
Path: parallel/test-dgram-multicast-set-interface-lo
assert.js:41
throw new errors.AssertionError({
^
AssertionError [ERR_ASSERTION]: undefined == true
at Object.<anonymous> (/root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-multicast-set-interface-lo.js:64:8)
at Module._compile (module.js:624:30)
at Object.Module._extensions..js (module.js:635:10)
at Module.load (module.js:545:32)
at tryModuleLoad (module.js:508:12)
at Function.Module._load (module.js:500:3)
at Function.Module.runMain (module.js:665:10)
at startup (bootstrap_node.js:187:16)
at bootstrap_node.js:608:3
Command: out/Release/node /root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-multicast-set-interface-lo.js
[03:09|% 100|+ 1893|- 2]: Done
Trouble shooting DNS issue on RHEL
BIND (the NameServer) consists of a set of dns related programs. The nameserver itself is called named. There is an admin tool named rdnc and dig or debugging. named reads it's configuration from /etc/named.conf.
/etc/nsswitch.conf
Before this configuration file existed name service lookups were hardcoded into the c library, as well as the search order. Name Service Switch was introduced by Sun Microsystems in there C library implementation in Solaris 2. This uses modules which allowes for adding new services without adding them to the GNU C library.
‘unavail’ The service is permanently unavailable. This can either mean the needed file is not available, or, for DNS, the server is not available or does not allow queries. The default action is continue.
For the hosts and networks databases the default value is dns [!UNAVAIL=return] files. I.e., the system is prepared for the DNS service not to be available but if it is available the answer it returns is definitive.
Regarding test/parallel/test-net-connect-immediate-finish.js and test/parallel/test-net-better-error-messages-port-hostname.js the and the error:
Path: parallel/test-net-connect-immediate-finish
assert.js:41
throw new errors.AssertionError({
^
AssertionError [ERR_ASSERTION]: 'EAI_AGAIN' === 'ENOTFOUND'
at Socket.client.once.common.mustCall (/builddir/build/BUILD/node-v8.6.0/test/parallel/test-net-connect-immediate-finish.js:35:10)
at Socket.<anonymous> (/builddir/build/BUILD/node-v8.6.0/test/common/index.js:514:15)
at Object.onceWrapper (events.js:316:30)
at emitOne (events.js:115:13)
at Socket.emit (events.js:210:7)
at emitErrorNT (internal/streams/destroy.js:64:8)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)
Command: out/Release/node /builddir/build/BUILD/node-v8.6.0/test/parallel/test-net-connect-immediate-finish.js
This only occurs when there is no available dns (you might need to comment it out in /etc/resolve.conf to reproduce this), I manually update /etc/nsswitch.conf and the update the hosts entry:
hosts: files dns
Notice that the last entry is dns which so it will return the result which will be Using the default (at least this is the default for the RHEL version that we are using in our image: registry.access.redhat.com/rhel7) this error does not occur as the default is:
hosts: files dns myhostname
Since this is an external configuration that could be different for different installations I think the best option is to update the test to be able to handle this situation. I'll create a pull request and see what people think.
Inspector console
In the start function of lib/internal/bootstrap_node.js we have a function that sets up the inspector console. After the global console object is set up, it is passed to setupInspector(originalConsole, wrappedConsole, Module); It does so with the help of a builtin module 'inspector' which can be found in src/inspector_js_api.cc.
const { addCommandLineAPI, consoleCall } = process.binding('inspector');
wrappedConsole[key] = consoleCall.bind(wrappedConsole,
originalConsole[key],
wrappedConsole[key],
config);
The above will create a new function which has its this
set to wrappedConsole
$ out/Debug/node
[Function: log] [Function: bound log] {}
[Function: info] [Function: bound log] {}
[Function: warn] [Function: bound warn] {}
[Function: error] [Function: bound warn] {}
[Function: dir] [Function: bound dir] {}
[Function: time] [Function: bound time] {}
[Function: timeEnd] [Function: bound timeEnd] {}
[Function: trace] [Function: bound trace] {}
[Function: assert] [Function: bound assert] {}
[Function: clear] [Function: bound clear] {}
[Function: count] [Function: bound count] {}
[Function: group] [Function: bound group] {}
[Function: groupCollapsed] [Function: bound group] {}
[Function: groupEnd] [Function: bound groupEnd] {}
So for all of the above functions they will now got through the ConsoleCall function in inspector_agent.cc which will check if the inspector is enabled, which is done by passing --inspect, for this process:
if (env->inspector_agent()->enabled()) {
...
}
In this case the writing to stdout will be performed twice, once to the inspector using:
inspector_method.As<Function>()->Call(context,
info.Holder(),
call_args.size(),
call_args.data()).IsEmpty());
For example you'll seen the output in both stdout and in chrome's console.
ccache
$ git clone https://github.com/ccache/ccache.git
$ cd ccache
$ ./autogen.sh
$ ./configure && make
Add ccache to your PATH.
export CC="ccache clang -Qunused-arguments"
export CXX="ccache clang++ -Qunused-arguments"
Now you can configure and build as normal.
To see what is in the cache:
$ ccache -s
To clear the cache:
$ ccache -C
GetPeerCertificate
STACK_OF(X509)* ssl_certs = SSL_get_peer_cert_chain(w->ssl_);
STACK_OF is a macro defined in deps/openssl/openssl/crypto/stack/safestack.h
and will expand to:
struct stack_st_X509* ssl_certs = SSL_get_peer_cert_chain(w->ssl_);
A Local<Object> instance holds a pointer to an Object. If you pass this instance to a function a copy of the object will be created. But both instances will point to the same object (the pointer is copied).
For example:
static void AddIssuer(X509** cert,
const STACK_OF(X509)* const peer_certs,
Local<Object> info,
Environment* const env) {
...
Local<Object> ca_info = X509ToObject(env, ca);
// Now we want to update what info points to so that is points to the value of ca_info instead.
(lldb) expr info
(v8::Local<v8::Object>) $4 = (val_ = 0x0000000106014b08)
(lldb) expr *info
(v8::Object *) $6 = 0x0000000106014b08
The copy constructor for Local<> will copy the value, which includes the pointer so these are separate objects:
(lldb) p &result
(v8::Local<v8::Object> *) $86 = 0x00007fff5fbf04b0
(lldb) p &info
(v8::Local<v8::Object> *) $87 = 0x00007fff5fbf04a8
But what is info supposed to represent? Well they both initially point to the same thing as we can see but as mentioned they are separate objects. So the following will update what they both (result and info) point to:
Local<Object> ca_info = X509ToObject(env, ca);
info->Set(env->issuercert_string(), ca_info);
But the following will cause info to copy constructed (I think) to ca_info so the connection with result is lost here. That is incorrect! What is happening is that the first time info == result so the issuer is set on it, and it contains the ca_info instance. So result can get to it. Next when info is set to ca_info:
info = ca_info;
This is for the next iteration and if there are more they will be chained to ca_info using the info->Set(env->issuercert_string(), ca_info), which remember can be accessed from result via its issuercert_string property.
result -> #issuercertificate -> ca_info -> #issuercerficate -> ca_info
If this is not done result will not be linked to them all and the chain broken.
Crypto SetKey
void DiffieHellman::SetPublicKey(const FunctionCallbackInfo<Value>& args) {
SetKey(args, [](DH* dh, BIGNUM* num) { DH_set0_key(dh, num, nullptr); },
"Public key");
}
void DiffieHellman::SetPrivateKey(const FunctionCallbackInfo<Value>& args) {
#if OPENSSL_VERSION_NUMBER >= 0x10100000L && OPENSSL_VERSION_NUMBER < 0x10100070L
// Older versions of OpenSSL 1.1.0 have a DH_set0_key which does not work for
// Node. See https://github.com/openssl/openssl/pull/4384.
#error "OpenSSL 1.1.0 revisions before 1.1.0g are not supported"
#endif
SetKey(args, [](DH* dh, BIGNUM* num) { DH_set0_key(dh, nullptr, num); },
"Private key");
}
void DiffieHellman::SetKey(const v8::FunctionCallbackInfo<v8::Value>& args,
void (*set_field)(DH*, BIGNUM*), const char* what) {
...
Notice that the second paramenter to SetKey is a function pointer:
void function_name(DH* dh, BIGHUM* n);
I noticed that DH_set0_key returns one and wondering why as the function pointer is declared as returning void:
static int DH_set0_key(DH* dh, BIGNUM* pub_key, BIGNUM* priv_key) {
if (pub_key != nullptr) {
BN_free(dh->pub_key);
dh->pub_key = pub_key;
}
if (priv_key != nullptr) {
BN_free(dh->priv_key);
dh->priv_key = priv_key;
}
return 1;
}
And the call looks like this in SetKey
:
set_field(dh->dh, num);
Object Set
The Set
functions in include/v8.h for Object are deprecated. The ones that should be used are the ones that
return a MayBe<bool> result instead. This means that you also have to call either FromJust() or ToChecked()
before using/retrieving the value.
3109 class V8_EXPORT Object : public Value {
3110 public:
3111 V8_DEPRECATE_SOON("Use maybe version",
3112 bool Set(Local<Value> key, Local<Value> value));
3113 V8_WARN_UNUSED_RESULT Maybe<bool> Set(Local<Context> context,
3114 Local<Value> key, Local<Value> value);
3115
3116 V8_DEPRECATE_SOON("Use maybe version",
3117 bool Set(uint32_t index, Local<Value> value));
3118 V8_WARN_UNUSED_RESULT Maybe<bool> Set(Local<Context> context, uint32_t index,
3119 Local<Value> value);
Unqualified name lookup
void SecureContext::Initialize(Environment* env, Local<Object> target) {
...
env->SetProtoMethod(t, "init", SecureContext::Init);
Even without the SecureContext
namespace Init
will be resolved correctly as resolution will
look in class of the member funtion which is SecureContext.
const binding = process.binding('crypto');
This will invoke SecureContext::Initialize.
const server = tls.Server(options, function(socket) {
exports.Server = require('_tls_wrap').Server;
_tls_common.js:
const binding = process.binding('crypto');
const NativeSecureContext = binding.SecureContext;
function SecureContext(secureProtocol, secureOptions, context) {
...
this.context = new NativeSecureContext();
}
So we can see that SecureContext::New is bound to NativeSecureContext.
node_crypto_bio
static const BIO_METHOD method = {
BIO_TYPE_MEM,
"node.js SSL buffer",
Write,
Read,
Puts,
Gets,
Ctrl,
New,
Free,
nullptr
};
BIO* NodeBIO::NewFixed(const char* data, size_t len) {
BIO* bio = New();
if (bio == nullptr ||
len > INT_MAX ||
BIO_write(bio, data, len) != static_cast<int>(len) ||
BIO_set_mem_eof_return(bio, 0) != 1) {
BIO_free(bio);
return nullptr;
}
return bio;
}
BIO_set_mem_eof_return
is a macro defined as:
# define BIO_set_mem_eof_return(b,v) BIO_ctrl(b,BIO_C_SET_BUF_MEM_EOF_RETURN,v,NULL)
which in our case would give:
BIO_strl(bio, BIO_C_SET_BUF_MEM_EOF_RETURN, 0, NULL)
This call will end up in bio_lib.c and its BIO_ctrl function:
ret = b->method->ctrl(b, cmd, larg, parg);
Now, b is the BIO we passed in, cmd is BIO_C_SET_BUF_MEM_EOF_RETURN (130), larg is 0, and parg is NULL. The interesting thing here is that b->method will be the method defined in node_crypto_bio.cc:
(lldb) p *b->method
(BIO_METHOD) $12 = {
type = 1025
name = 0x0000000101d44053 "node.js SSL buffer"
bwrite = 0x000000010192ed10 (node`node::crypto::NodeBIO::Write(bio_st*, char const*, int) at node_crypto_bio.cc:142)
bread = 0x000000010192e940 (node`node::crypto::NodeBIO::Read(bio_st*, char*, int) at node_crypto_bio.cc:92)
bputs = 0x000000010192ef90 (node`node::crypto::NodeBIO::Puts(bio_st*, char const*) at node_crypto_bio.cc:151)
bgets = 0x000000010192efe0 (node`node::crypto::NodeBIO::Gets(bio_st*, char*, int) at node_crypto_bio.cc:156)
ctrl = 0x000000010192f2f0 (node`node::crypto::NodeBIO::Ctrl(bio_st*, int, long, void*) at node_crypto_bio.cc:182)
create = 0x000000010192e7c0 (node`node::crypto::NodeBIO::New(bio_st*) at node_crypto_bio.cc:68)
destroy = 0x000000010192e830 (node`node::crypto::NodeBIO::Free(bio_st*) at node_crypto_bio.cc:77)
callback_ctrl = 0x0000000000000000
}
So b->method->ctrl will call NodeBIO::Ctrl: The first things that is done is that the NodeBIO instance is retrieved from the bio:
NodeBIO* nbio = FromBIO(bio);
'FromBIO':
CHECK_NE(BIO_get_data(bio), nullptr);
return static_cast<NodeBIO*>(BIO_get_data(bio));
BIO_get_data is a macro for OPENSSL versions greater than 1.0.1:
#define BIO_get_data(bio) bio->ptr
So we can see that the BIO's ponter is a pointer to the NodeBIO instance.
case BIO_C_SET_BUF_MEM_EOF_RETURN
nbio->set_eof_return(num);
break;
So we are calling 'set_eof_return' with
(lldb) p num
(long) $14 = 0
Node Crypto BIO Buffer
node_crypto_bio.h has a private Buffer class.
To understand what this buffer does and how it works lets take a look at a usage of it... SecureContext::AddCACert:
...
BIO* bio = LoadBIO(env, args[0]);
In this case args[0] is a certificate (a string) so the following path in LoadBIO will be taken:
return NodeBIO::NewFixed(*s, s.length());
NodeBIO::NewFixed will call:
if (bio == nullptr ||
len > INT_MAX ||
BIO_write(bio, data, len) != static_cast<int>(len) ||
BIO_set_mem_eof_return(bio, 0) != 1) {
BIO_free(bio);
return nullptr;
}
We are interested in the BIO_write call which will call Write on the bio's method which is NodeBIO::Write:
void NodeBIO::Write(const char* data, size_t size) {
...
// Allocate initial buffer if the ring is empty
TryAllocateForWrite(left);
void NodeBIO::TryAllocateForWrite(size_t hint) {
Buffer* w = write_head_;
Buffer* r = read_head_;
// If write head is full, next buffer is either read head or not empty.
if (w == nullptr ||
(w->write_pos_ == w->len_ &&
(w->next_ == r || w->next_->write_pos_ != 0))) {
size_t len = w == nullptr ? initial_ : kThroughputBufferLength;
if (len < hint)
len = hint;
Buffer* next = new Buffer(env_, len);
if (w == nullptr) {
next->next_ = next;
write_head_ = next;
read_head_ = next;
} else {
next->next_ = w->next_;
w->next_ = next;
}
}
}
First time entering this function w and r will be null as write_head_ and read_head_ will be null:
(lldb) expr w
(node::crypto::NodeBIO::Buffer *) $24 = 0x0000000000000000
(lldb) expr r
(node::crypto::NodeBIO::Buffer *) $25 = 0x0000000000000000
size_t len = initial_; // since w == nullptr;
(lldb) p len
(size_t) $28 = 1024
(lldb) p hint
(size_t) $29 = 1224
if (len < hint) // 1024 < 1224
len = 1224
Buffer* next = new Buffer(env_, 1224);
Buffer constructor:
data_ = new char[len];
if (w == nullptr) {
next->next_ = next;
write_head_ = next;
read_head_ = next;
}
So initially there is only a single entry all pointing to this first one. Next, we are back in NodeBIO::Write:
while (left > 0) { // first time left is == 1224
....
memcpy(write_head_->data_ + write_head_->write_pos_, data + offset, to_write);
void* memcpy( void* dest, const void* src, std::size_t count ) so the destination in our case is write_head_data_ + write_head_write_pos_, the data to be copied is data + 0, and to_write is 1224.
ClientHelloParser
node_crypto.h has a class member that is declared like:
ClientHelloParser hello_parser_;
So when a new SSLWrap is created constructor of ClientHelloParser will be called. Looking at the constructor for ClientHelloParser it first initilizes its fields and then calls Reset():
ClientHelloParser() : state_(kEnded),
onhello_cb_(nullptr),
onend_cb_(nullptr),
cb_arg_(nullptr),
session_size_(0),
session_id_(nullptr),
servername_size_(0),
servername_(nullptr),
ocsp_request_(0),
tls_ticket_size_(0),
tls_ticket_(nullptr) {
Reset();
}
inline void ClientHelloParser::Reset() {
frame_len_ = 0;
body_offset_ = 0;
extension_offset_ = 0;
session_size_ = 0;
session_id_ = nullptr;
tls_ticket_size_ = -1;
tls_ticket_ = nullptr;
servername_size_ = 0;
servername_ = nullptr;
ocsp_request_ = 0; // added by me
}
I can't find a reason for not resetting ocsp_request_ here. I'll create a PR to get some feedback.
v8::Object::Set and exceptions
This section goes through a call to Set and the possible exceptions that might be throws and how they can be handled.
$ lldb -- out/Debug/node --inspect-brk test/parallel/test-tls-legacy-onselect.js
env->SetProtoMethod(t, "setSNICallback", Connection::SetSNICallback);
(lldb) br s -f node_crypto.cc -l 3594
(lldb) r
Open chrome://inspect and set a break point on this line:
const pair = tls.createSecurePair(null, true, false, false);
A SecurePair
is a pair of streams to do encrypted communication with.
Local<Object> obj = Object::New(env->isolate());
obj->Set(env->context(), env->onselect_string(), args[0]).FromJust();
We are creating a new Localv8::Object and setting a property on that object. The property name is taken from the Environment function onselect_string(). This function is generated by a macro and by:
V(onselect_string, "onselect")
Now, obj->Set
is only setting a property to be a function and here is a check in this
specific code path that arg[0] exists and is a function.
(lldb) jlh obj
0x10583e1675b9: [JS_OBJECT_TYPE]
- map = 0x105859b5fe79 [FastProperties]
- prototype = 0x105819c04679
- elements = 0x1058b5982241 <FixedArray[0]> [HOLEY_ELEMENTS]
- properties = 0x1058b5982241 <FixedArray[0]> {
#onselect: 0x10583e167571 <JSFunction (sfi = 0x1058fc5f3969)> (const data descriptor)
}
But, there is a chance where an exception might be thrown and that is if there is a setter for `onselect'. This migth look like:
Object.defineProperty(Object.prototype, 'onselect', {
set: function(f) {
console.log('throw error from setter...');
throw Error('dummy setter error');
}
});
Calling obj->Set(env->context(), env->onselect_string(), args[0]).FromJust();
would cause an exception to be
thrown:
$ out/Debug/node test/parallel/test-tls-legacy-onselect.js
throw error from setter...
FATAL ERROR: v8::FromJust Maybe value is Nothing.
1: node::Abort() [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
2: node::OnFatalError(char const*, char const*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
3: v8::Utils::ReportApiFailure(char const*, char const*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
4: v8::Utils::ApiCheck(bool, char const*, char const*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
5: v8::V8::FromJustIsNothing() [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
6: v8::Maybe<bool>::FromJust() const [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
7: node::crypto::Connection::SetSNICallback(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
8: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
9: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
10: v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
11: v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
12: 0x10e0737043c4
13: 0x10e0737f1600
Abort trap: 6
Notice the FromJust()
which will crash if there is an error in the JavaScript setter.
But instead what should happen is (in SetSNICallback):
if (obj->Set(env->context(), env->onselect_string(), args[0]).IsNothing()) {
return;
}
return
where will return to api-arguments.cc
line 26:
return GetReturnValue<Object>(isolate);
GetReturnValue
can be found in api-arguments.h'. It looks like this will see if there is a return value and return a Handle to it or an empty Handle. After returning from
v8::internal::FunctionCallbackArguments::Call` we are the builtins-api.cc HandleApiCallHelper:
Handle<Object> result = custom.Call(callback); // this was our call to SetSNICallback
RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, Object);
RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION
is a macro:
#define RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, T) \
RETURN_VALUE_IF_SCHEDULED_EXCEPTION(isolate, MaybeHandle<T>())
Which will expend to
RETURN_VALUE_IF_SCHEDULED_EXCEPTION(isolate, MaybeHandle<Object>())
#define RETURN_VALUE_IF_SCHEDULED_EXCEPTION(isolate, value) \
do { \
Isolate* __isolate__ = (isolate); \
DCHECK(!__isolate__->has_pending_exception()); \
if (__isolate__->has_scheduled_exception()) { \
__isolate__->PromoteScheduledException(); \
return value; \
} \
} while (false)
So this will expand to (which will be places in HandleApiCallHelper replacing RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION) as:
Isolate* __isolate__ = (isolate);
DCHECK(!__isolate__->has_pending_exception());
if (__isolate__->has_scheduled_exception()) {
__isolate__->PromoteScheduledException();
return MaybeHandle<Object();
}
In this case there will be a pending exception so isolate->PromoteScheduledException will be called.
PromoteScheduledException:
Object* thrown = scheduled_exception();
clear_scheduled_exception();
return ReThrow(thrown);
(lldb) job thrown
0x11728e9a5401: [JS_ERROR_TYPE]
- map = 0x1172bc86cf79 [FastProperties]
- prototype = 0x117204d0d679
- elements = 0x1172afa82241 <FixedArray[0]> [HOLEY_SMI_ELEMENTS]
- properties = 0x11728e9a5469 <PropertyArray[3]> {
#stack: 0x1172b59ca9b9 <AccessorInfo> (const accessor descriptor)
#message: 0x1172c10b6711 <String[18]: dummy setter error> (data field 0) properties[0]
0x1172afa86899 <Symbol: detailed_stack_trace_symbol>: 0x11728e9a5491 <FixedArray[5]> (data field 1) properties[1]
0x1172afa87069 <Symbol: stack_trace_symbol>: 0x11728e9a5cf1 <JSArray[26]> (data field 2) properties[2]
}
ReThrow:
set_pending_exception(exception);
return heap()->exception();
void Isolate::set_pending_exception(Object* exception_obj) {
DCHECK(!exception_obj->IsException(this));
thread_local_top_.pending_exception_ = exception_obj;
}
Now when returning we will land in bootstrap_node.js
and process_fatalException
callback:
process._fatalException = function(er) {
...
if (!caught)
caught = process.emit('uncaughtException', er);
// If someone handled it, then great. otherwise, die in C++ land
// since that means that we'll exit the process, emit the 'exit' event
if (!caught) {
try {
if (!process._exiting) {
process._exiting = true;
process.emit('exit', 1);
}
} catch (er) {
// nothing to be done about it at this point.
}
} else {
...
}
return caught;
}
Where er will be:
er = Error: dummy setter error at Object.set
The error would be:
throw error from setter...
/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-legacy-onselect.js:13
throw Error('dummy setter error');
^
Error: dummy setter error
at Object.set (/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-legacy-onselect.js:13:11)
at Server.<anonymous> (/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-legacy-onselect.js:20:12)
at Server.<anonymous> (/Users/danielbevenius/work/nodejs/node/test/common/index.js:520:15)
at Server.emit (events.js:126:13)
at TCP.onconnection (net.js:1595:8)
This is more informative about the error and easier to understand where it happend.
v8::Object::Set walkthrough
This function can be found in api.cc
. In the walkthrough what we are setting is a callback on an object.
Maybe<bool> v8::Object::Set(v8::Local<v8::Context> context, v8::Local<Value> key, v8::Local<Value> value) {
auto isolate = reinterpret_cast<i::Isolate*>(context->GetIsolate());
ENTER_V8(isolate, context, Object, Set, Nothing<bool>(), i::HandleScope);
...
}
In this walk through the arguments are:
(lldb) jlh key
#onselect
(lldb) jlh value
0x2cfe064e7d11: [Function]
- map = 0x2cfea4f02411 [FastProperties]
- prototype = 0x2cfec3204631
- elements = 0x2cfea2482241 <FixedArray[0]> [HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0x2cfec4a73ae1 <SharedFunctionInfo>
- name = 0x2cfea2482431 <String[0]: >
- formal_parameter_count = 0
- kind = [ NormalFunction ]
- context = 0x2cfe064e1be1 <FixedArray[5]>
- code = 0x6b17cf07d01 <Code BUILTIN>
- source code = () {
context.actual++;
return fn.apply(this, arguments);
}
- properties = 0x2cfea2482241 <FixedArray[0]> {
#length: 0x2cfed24ca4f1 <AccessorInfo> (const accessor descriptor)
#name: 0x2cfed24ca561 <AccessorInfo> (const accessor descriptor)
#prototype: 0x2cfed24ca5d1 <AccessorInfo> (const accessor descriptor)
}
- feedback vector: 0x2cfe7b857fa9: [FeedbackVector] in OldSpace
- length: 9
SharedFunctionInfo: 0x2cfec4a73ae1 <SharedFunctionInfo>
Optimized Code: 0
Invocation Count: 1
Profiler Ticks: 0
Slot #0 LoadProperty PREMONOMORPHIC
[0]: 0x2cfea2486d59 <Symbol: premonomorphic_symbol>
[1]: 0x2cfea24871c1 <Symbol: uninitialized_symbol>
Slot #2 StoreNamedStrict PREMONOMORPHIC
[2]: 0x2cfea2486d59 <Symbol: premonomorphic_symbol>
[3]: 0x2cfea24871c1 <Symbol: uninitialized_symbol>
Slot #4 BinaryOp MONOMORPHIC
[4]: 1
Slot #5 Call MONOMORPHIC
[5]: 0x2cfe7b858019 <WeakCell value= 0x2cfec32088c1 <JSFunction apply (sfi = 0x2cfea24b2181)>>
[6]: 1
Slot #7 LoadProperty PREMONOMORPHIC
[7]: 0x2cfea2486d59 <Symbol: premonomorphic_symbol>
[8]: 0x2cfea24871c1 <Symbol: uninitialized_symbol>
This is the callback passed in :
pair.ssl.setSNICallback(common.mustCall(function() {
raw.destroy();
server.close();
}));
ENTER_V8
is a macro which is defined as:
#define ENTER_V8(isolate, context, class_name, function_name, bailout_value, \
HandleScopeClass) \
ENTER_V8_HELPER_DO_NOT_USE(isolate, context, class_name, function_name, \
bailout_value, HandleScopeClass, true)
So ENTER_V8(isolate, context, Object, Set, Nothing<bool>(), i::HandleScope);
would expend to:
ENTER_V8_HELPER_DO_NOT_USE(isolate, context, Object, Set, Nothing<bool>(), i::HandleScope, true)
#define ENTER_V8_HELPER_DO_NOT_USE(isolate, context, Object, \
function_name, bailout_value, \
HandleScopeClass, do_callback) \
if (IsExecutionTerminatingCheck(isolate)) { \
return bailout_value; \
} \
HandleScopeClass handle_scope(isolate); \
CallDepthScope<do_callback> call_depth_scope(isolate, context); \
LOG_API(isolate, class_name, function_name); \
i::VMState<v8::OTHER> __state__((isolate)); \
bool has_pending_exception = false
So back in our v8::Object::Set function these macros would expend to:
Maybe<bool> v8::Object::Set(v8::Local<v8::Context> context, v8::Local<Value> key, v8::Local<Value> value) {
auto isolate = reinterpret_cast<i::Isolate*>(context->GetIsolate());
if (IsExecutionTerminatingCheck(isolate)) {
return Nothing<bool>();
}
HandleScopeClass handle_scope(isolate);
CallDepthScope<do_callback> call_depth_scope(isolate, context);
LOG_API(isolate, Object, Set);
i::VMState<v8::OTHER> __state__((isolate));
bool has_pending_exception = false;
auto self = Utils::OpenHandle(this);
auto key_obj = Utils::OpenHandle(reinterpret_cast<Name*>(*key));
auto value_obj = Utils::OpenHandle(*value);
has_pending_exception = i::Runtime::SetObjectProperty(isolate,
self,
key_obj,
value_obj,
i::LanguageMode::kSloppy).is_null();
RETURN_ON_FAILED_EXECUTION_PRIMITIVE(bool);
return Just(true);
}
So, lets take a closer look at i::Runtime::SetObjectProperty (i
is V8's internal namespace) which we can find in
src/runtime/runtime-object.cc.
// Check if the given key is an array index.
bool success = false;
LookupIterator it = LookupIterator::PropertyOrElement(isolate, object, key, &success);
if (!success) return MaybeHandle<Object>();
MAYBE_RETURN_NULL(Object::SetProperty(&it, value, language_mode,
Object::MAY_BE_STORE_FROM_KEYED));
return value;
MAYBE_RETURN_NULL
macro :
#define MAYBE_RETURN_NULL(call) MAYBE_RETURN(call, MaybeHandle<Object>())
#define MAYBE_RETURN(call, value) \
do { \
if ((call).IsNothing()) return value; \
} while (false)
So, in this case our call will expend to:
if (Object::SetProperty(&it, value, language_mode, Object::MAY_BE_STORE_FROM_KEYED).IsNothing())
return MaybeHandle<Object>();
Lets follow SetProperty
(in src/objects.cc line 4753) and see what it does:
Maybe<bool> Object::SetProperty(LookupIterator* it, Handle<Object> value,
LanguageMode language_mode,
StoreFromKeyed store_mode) {
if (it->IsFound()) {
bool found = true;
Maybe<bool> result =
SetPropertyInternal(it, value, language_mode, store_mode, &found);
if (found) return result;
}
// If the receiver is the JSGlobalObject, the store was contextual. In case
// the property did not exist yet on the global object itself, we have to
// throw a reference error in strict mode. In sloppy mode, we continue.
if (is_strict(language_mode) && it->GetReceiver()->IsJSGlobalObject()) {
it->isolate()->Throw(*it->isolate()->factory()->NewReferenceError(
MessageTemplate::kNotDefined, it->name()));
return Nothing<bool>();
}
ShouldThrow should_throw =
is_sloppy(language_mode) ? kDontThrow : kThrowOnError;
return AddDataProperty(it, value, NONE, should_throw, store_mode);
}
SetPropertyInternal
(in src/objects.cc):
ShouldThrow should_throw = is_sloppy(language_mode) ? DONT_THROW : THROW_ON_ERROR;
switch (it->state()) {
...
}
lldb) p it->state()
(v8::internal::LookupIterator::State) $22 = ACCESSOR
(lldb) expr should_throw
(ShouldThrow) $28 = DONT_THROW
....
return SetPropertyWithAccessor(it, value, should_throw);
SetPropertyWithAccessors
:
Handle<Object> structure = it->GetAccessors();
Handle<Object> receiver = it->GetReceiver();
...
Handle<Object> setter(AccessorPair::cast(*structure)->setter(), isolate);
(lldb) job *setter
0x2cfe064bd1b1: [Function]
- map = 0x2cfea4f02411 [FastProperties]
- prototype = 0x2cfec3204631
- elements = 0x2cfea2482241 <FixedArray[0]> [HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0x2cfe232eed51 <SharedFunctionInfo set>
- name = 0x2cfea2485f41 <String[3]: set>
- formal_parameter_count = 1
- kind = [ NormalFunction ]
- context = 0x2cfedc582a29 <FixedArray[8]>
- code = 0x6b17cf07d01 <Code BUILTIN>
- source code = (f) {
console.log('throw error from setter...');
throw Error('dummy setter error');
}
- properties = 0x2cfea2482241 <FixedArray[0]> {
#length: 0x2cfed24ca4f1 <AccessorInfo> (const accessor descriptor)
#name: 0x2cfed24ca561 <AccessorInfo> (const accessor descriptor)
#prototype: 0x2cfed24ca5d1 <AccessorInfo> (const accessor descriptor)
}
- feedback vector: not available
We can see that this is our setter.
So the next line of interest is:
return SetPropertyWithDefinedSetter(receiver, Handle<JSReceiver>::cast(setter), value, should_throw);
SetPropertyWithDefinedSetter
RETURN_ON_EXCEPTION_VALUE(isolate, Execution::Call(isolate, setter, receiver,
arraysize(argv), argv),
Nothing<bool>());
return Just(true);
RETURN_ON_EXCEPTION_VALUE
checks if Call
returned null and if so returns Nothing<bool>()
.
So lets follow Call
(src/execution.cc line 191):
return CallInternal(isolate, callable, receiver, argc, argv, MessageHandling::kReport);
MessageHandling
is an enum define as:
enum class MessageHandling { kReport, kKeepPending };
CallInternal
:
return Invoke(isolate, false, callable, receiver, argc, argv,
isolate->factory()->undefined_value(), message_handling);
The second argument name is is_construct
and undefined is passed for new_target
.
Invoke
(src/execution.cc line 58):
MUST_USE_RESULT MaybeHandle<Object> Invoke(
Isolate* isolate, bool is_construct, Handle<Object> target,
Handle<Object> receiver, int argc, Handle<Object> args[],
Handle<Object> new_target, Execution::MessageHandling message_handling) {
...
typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
Object* receiver, int argc,
Object*** args);
...
Object* value = NULL;
JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
Handle<Code> code = is_construct
? isolate->factory()->js_construct_entry_code()
: isolate->factory()->js_entry_code();
(lldb) p is_construct
(bool) $74 = false
So isolate->factory()->js_entry_code()
will be called. Lets take a look at this code. To do this we have to
take the address from code->entry()
(lldb) job *code
0x6b17cf04001: [Code]
kind = STUB
major_key = JSEntryStub
compiler = unknown
Instructions (size = 234)
0x6b17cf04060 0 55 push rbp
0x6b17cf04061 1 4889e5 REX.W movq rbp,rsp
0x6b17cf04064 4 6a02 push 0x2
0x6b17cf04066 6 49ba7819000601000000 REX.W movq r10,0x106001978 ;; external reference (Isolate::context_address)
0x6b17cf04070 10 4d8b12 REX.W movq r10,[r10]
0x6b17cf04073 13 4152 push r10
0x6b17cf04075 15 4154 push r12
0x6b17cf04077 17 4155 push r13
0x6b17cf04079 19 4156 push r14
0x6b17cf0407b 1b 4157 push r15
0x6b17cf0407d 1d 53 push rbx
0x6b17cf0407e 1e 49bd4800000601000000 REX.W movq r13,0x106000048 ;; external reference (Heap::roots_array_start())
0x6b17cf04088 28 4981c580000000 REX.W addq r13,0x80
0x6b17cf0408f 2f 49bae819000601000000 REX.W movq r10,0x1060019e8 ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04099 39 41ff32 push [r10]
0x6b17cf0409c 3c 48a1081a000601000000 REX.W movq rax,(0x106001a08) ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf040a6 46 4885c0 REX.W testq rax,rax
0x6b17cf040a9 49 0f8514000000 jnz 0x6b17cf040c3 <+0x63>
0x6b17cf040af 4f 6a02 push 0x2
0x6b17cf040b1 51 488bc5 REX.W movq rax,rbp
0x6b17cf040b4 54 48a3081a000601000000 REX.W movq (0x106001a08),rax ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf040be 5e e902000000 jmp 0x6b17cf040c5 <+0x65>
0x6b17cf040c3 63 6a00 push 0x0
0x6b17cf040c5 65 e916000000 jmp 0x6b17cf040e0 <+0x80>
0x6b17cf040ca 6a 48a38819000601000000 REX.W movq (0x106001988),rax ;; external reference (Isolate::pending_exception_address)
0x6b17cf040d4 74 498b8588000000 REX.W movq rax,[r13+0x88]
0x6b17cf040db 7b e932000000 jmp 0x6b17cf04112 <+0xb2>
0x6b17cf040e0 80 49baf019000601000000 REX.W movq r10,0x1060019f0 ;; external reference (Isolate::handler_address)
0x6b17cf040ea 8a 41ff32 push [r10]
0x6b17cf040ed 8d 49baf019000601000000 REX.W movq r10,0x1060019f0 ;; external reference (Isolate::handler_address)
0x6b17cf040f7 97 498922 REX.W movq [r10],rsp
0x6b17cf040fa 9a 6a00 push 0x0
0x6b17cf040fc 9c e8bf000000 call 0x6b17cf041c0 (JSEntryTrampoline) ;; code: BUILTIN
0x6b17cf04101 a1 49baf019000601000000 REX.W movq r10,0x1060019f0 ;; external reference (Isolate::handler_address)
0x6b17cf0410b ab 418f02 pop [r10]
0x6b17cf0410e ae 4883c400 REX.W addq rsp,0x0
0x6b17cf04112 b2 5b pop rbx
0x6b17cf04113 b3 4883fb02 REX.W cmpq rbx,0x2
0x6b17cf04117 b7 0f8511000000 jnz 0x6b17cf0412e <+0xce>
0x6b17cf0411d bd 49ba081a000601000000 REX.W movq r10,0x106001a08 ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf04127 c7 49c70200000000 REX.W movq [r10],0x0
0x6b17cf0412e ce 49bae819000601000000 REX.W movq r10,0x1060019e8 ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04138 d8 418f02 pop [r10]
0x6b17cf0413b db 5b pop rbx
0x6b17cf0413c dc 415f pop r15
0x6b17cf0413e de 415e pop r14
0x6b17cf04140 e0 415d pop r13
0x6b17cf04142 e2 415c pop r12
0x6b17cf04144 e4 4883c410 REX.W addq rsp,0x10
0x6b17cf04148 e8 5d pop rbp
0x6b17cf04149 e9 c3 retl
Handler Table (size = 24)
RelocInfo (size = 23)
0x6b17cf04068 external reference (Isolate::context_address) (0x106001978)
0x6b17cf04080 external reference (Heap::roots_array_start()) (0x106000048)
0x6b17cf04091 external reference (Isolate::c_entry_fp_address) (0x1060019e8)
0x6b17cf0409e external reference (Isolate::js_entry_sp_address) (0x106001a08)
0x6b17cf040b6 external reference (Isolate::js_entry_sp_address) (0x106001a08)
0x6b17cf040cc external reference (Isolate::pending_exception_address) (0x106001988)
0x6b17cf040e2 external reference (Isolate::handler_address) (0x1060019f0)
0x6b17cf040ef external reference (Isolate::handler_address) (0x1060019f0)
0x6b17cf040fd code target (BUILTIN) (0x6b17cf041c0)
0x6b17cf04103 external reference (Isolate::handler_address) (0x1060019f0)
0x6b17cf0411f external reference (Isolate::js_entry_sp_address) (0x106001a08)
0x6b17cf04130 external reference (Isolate::c_entry_fp_address) (0x1060019e8)
Object* orig_func = *new_target;
Object* func = *target;
Object* recv = *receiver;
Object*** argv = reinterpret_cast<Object***>(args);
if (FLAG_profile_deserialization && target->IsJSFunction()) {
PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
}
RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
CALL_GENEREATED_CODE
is defined in src/x64/simulator-x64.h
:
#define CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
(entry(p0, p1, p2, p3, p4))
Now, stub_entry is of type
JSEntryFunction` which is a typedef
typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
Object* receiver, int argc,
Object*** args);
stub_entry(orig_func, func, recv, argc, argv)
JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
From the comments about FUNCTION_CAST it is used to invoke generated code from within C:
// FUNCTION_CAST<F>(addr) casts an address into a function
// of type F. Used to invoke generated code from within C.
template <typename F>
F FUNCTION_CAST(Address addr) {
return reinterpret_cast<F>(reinterpret_cast<intptr_t>(addr));
}
Lets take a look at the arguments to entry which are:
stub_entry(orig_func, func, recv, argc, argv)
To follow the code in stub_entry
we need to step-inst (instruction level step) but there is not debug information
as this is generated code. What we can do is after every step call dissasemble --pc:
(lldb) target stop-hook add
Enter your stop hook command(s). Type 'DONE' to end.
> disassemble --pc
> DONE
Stop hook #1 added.
(lldb) undisplay
Now we should be able to step into using:
(lldb) si
Process 72301 stopped
* thread #1: tid = 0xc2e41e, 0x000006b17cf04060, queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x000006b17cf04060
-> 0x6b17cf04060: pushq %rbp
0x6b17cf04061: movq %rsp, %rbp
0x6b17cf04064: pushq $0x2
0x6b17cf04066: movabsq $0x106001978, %r10 ; imm = 0x106001978
Notice that the address matches that of the output from (lldb) job *code
above:
0x6b17cf04060 0 55 push rbp
0x6b17cf04061 1 4889e5 REX.W movq rbp,rsp
0x6b17cf04064 4 6a02 push 0x2
0x6b17cf04066 6 49ba7819000601000000 REX.W movq r10,0x106001978 ;; external reference (Isolate::context_address)
V8 uses Intel's assembly syntax and lldb uses AT&T syntax so the src/dest arguments are switched around, and registers prefixes are not used in Intel syntax.
Start of (lldb) job *code
:
0x6b17cf04060 0 55 push rbp
0x6b17cf04061 1 4889e5 REX.W movq rbp,rsp
0x6b17cf04064 4 6a02 push 0x2
0x6b17cf04066 6 49ba7819000601000000 REX.W movq r10,0x106001978 ;; external reference (Isolate::context_address)
0x6b17cf04070 10 4d8b12 REX.W movq r10,[r10]
I'm trying to line up the assembly code output with src/x64/code-stubs-x64.cc and JSEntryStub::Generate
and see if I'm
looking at the correct code:
push rbp
__ pushq(rbp);
movq rpb, rsp
__ movp(rbp, rsp);
push 0x2
-- Push(Immediate(StackFrame::TypeToMarker(type())))
This pushed the type of the stack frame which is an emmediate value 2. I think this value is taken from src/frames.h
and StackFrame class which has an enum:
enum Type {
NONE = 0,
STACK_FRAME_TYPE_LIST(DECLARE_TYPE)
NUMBER_OF_TYPES,
// Used by FrameScope to indicate that the stack frame is constructed
// manually and the FrameScope does not need to emit code.
MANUAL
};
And the first in the STACK_FRAME_LIST is:
#define STACK_FRAME_TYPE_LIST(V) \
V(ENTRY, EntryFrame) \
...
movq r10,0x106001978
matches:
ExternalReference context_address(IsolateAddressId::kContextAddress, isolate());
__ Load(kScratchRegister, context_address);
kScratchRegister
is r10 and we are moving 0x106001978
(the context_address) into register r10.
`movq r10,[r10]` __ Load(kScratchRegister, context_address); // get the pointer
`push r10` __ Push(kScratchRegister); // push the pointer onto the stack
`push r12` __ pushq(r12);
`push r13` __ pushq(r13);
`push r14` __ pushq(r14);
`push r15` __ pushq(r15);
`push rbx` __ pushq(rbx);
`movq r13,0x106000048` __ InitializeRootRegister(); // external reference (Heap::roots_array_start())
`addq r13,0x80` __ Push(c_entry_fp_operand);
`movq r10,0x1060019e8` __ Load(rax, js_entry_sp); ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04099 39 41ff32 push [r10]
0x6b17cf0409c 3c 48a1081a000601000000 REX.W movq rax,(0x106001a08) ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf040a6 46 4885c0 REX.W testq rax,rax
0x6b17cf040a9 49 0f8514000000 jnz 0x6b17cf040c3 <+0x63>
You might be wondering what that __
is, well it is a macro:
#define __ ACCESS_MASM(masm)
#define ACCESS_MASM(masm) masm->
So __
will get expanded to masm->pushq(rbp) for example. I think this is done so that it look more like assembly and
not so much as C++/C.
0x6b17cf04060 0 55 push rbp // prologue
0x6b17cf04061 1 4889e5 REX.W movq rbp,rsp // prologue
0x6b17cf04064 4 6a02 push 0x2 // Stack frame type = Context Address type
0x6b17cf04066 6 49ba7819000601000000 REX.W movq r10,0x106001978 // external reference (Isolate::context_address)
0x6b17cf04070 10 4d8b12 REX.W movq r10,[r10] // dereference the context_address pointer
0x6b17cf04073 13 4152 push r10 // push contents of r10 (ScratchRegister)
0x6b17cf04075 15 4154 push r12 // push contents of r12 argument passed
0x6b17cf04077 17 4155 push r13
0x6b17cf04079 19 4156 push r14
0x6b17cf0407b 1b 4157 push r15
0x6b17cf0407d 1d 53 push rbx // push
0x6b17cf0407e 1e 49bd4800000601000000 REX.W movq r13,0x106000048 // external reference (Heap::roots_array_start())
0x6b17cf04088 28 4981c580000000 REX.W addq r13,0x80 // push e_entry_fp_operand
0x6b17cf0408f 2f 49bae819000601000000 REX.W movq r10,0x1060019e8 // external reference (Isolate::c_entry_fp_address)
0x6b17cf04099 39 41ff32 push [r10] // push the scratch targed used above
0x6b17cf0409c 3c 48a1081a000601000000 REX.W movq rax,(0x106001a08) // external reference (Isolate::js_entry_sp_address)
0x6b17cf040a6 46 4885c0 REX.W testq rax,rax // check if rax is zero
0x6b17cf040a9 49 0f8514000000 jnz 0x6b17cf040c3 <+0x63> // jump if not zero (ZF = 0)
0x6b17cf040af 4f 6a02 push 0x2 //StackFrame::OUTERMOST_JSENTRY_FRAME
0x6b17cf040b1 51 488bc5 REX.W movq rax,rbp // __ movp(rax, rbp)
0x6b17cf040b4 54 48a3081a000601000000 REX.W movq (0x106001a08),rax // __ Store(js_entry_sp, rax)
0x6b17cf040be 5e e902000000 jmp 0x6b17cf040c5 <+0x65> ---------+
0x6b17cf040c3 63 6a00 push 0x0 | // __ Push(Immediate(StackFrame::INNER_JSENTRY_FRAME));
+-------------------------------------------------------------------+
|
0x6b17cf040c5 65 e916000000 jmp 0x6b17cf040e0 <+0x80> ---------+ // __ jmp(&invoke);
0x6b17cf040ca 6a 48a38819000601000000 REX.W movq (0x106001988),rax | // __ Store(pending_exception, rax)
0x6b17cf040d4 74 498b8588000000 REX.W movq rax,[r13+0x88] | // __ LoadRoot(rax, Heap::kExceptionRootIndex);
0x6b17cf040db 7b e932000000 jmp 0x6b17cf04112 <+0xb2> | // __ jump(&exit)
+-------------------------------------------------------------------+
|
0x6b17cf040e0 80 49baf019000601000000 REX.W movq r10,0x1060019f0 // external reference (Isolate::handler_address)
0x6b17cf040ea 8a 41ff32 push [r10] // push value of the pointer onto the stack
0x6b17cf040ed 8d 49baf019000601000000 REX.W movq r10,0x1060019f0 // PushStackHandler (Isolate::handler_address)
0x6b17cf040f7 97 498922 REX.W movq [r10],rsp //
0x6b17cf040fa 9a 6a00 push 0x0 // RelocInfo::CODE_TARGET ?
0x6b17cf040fc 9c e8bf000000 call 0x6b17cf041c0 (JSEntryTrampoline) // code: BUILTIN
0x6b17cf04101 a1 49baf019000601000000 REX.W movq r10,0x1060019f0 // PopStackHandnler (Isolate::handler_address)
0x6b17cf0410b ab 418f02 pop [r10]
0x6b17cf0410e ae 4883c400 REX.W addq rsp,0x0
0x6b17cf04112 b2 5b pop rbx
0x6b17cf04113 b3 4883fb02 REX.W cmpq rbx,0x2
0x6b17cf04117 b7 0f8511000000 jnz 0x6b17cf0412e <+0xce>
0x6b17cf0411d bd 49ba081a000601000000 REX.W movq r10,0x106001a08 ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf04127 c7 49c70200000000 REX.W movq [r10],0x0
0x6b17cf0412e ce 49bae819000601000000 REX.W movq r10,0x1060019e8 ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04138 d8 418f02 pop [r10]
0x6b17cf0413b db 5b pop rbx
0x6b17cf0413c dc 415f pop r15
0x6b17cf0413e de 415e pop r14
0x6b17cf04140 e0 415d pop r13
0x6b17cf04142 e2 415c pop r12
0x6b17cf04144 e4 4883c410 REX.W addq rsp,0x10
0x6b17cf04148 e8 5d pop rbp
0x6b17cf04149 e9 c3 retl
0x6b17cf040fd code target (BUILTIN) (0x6b17cf041c0)
If we set a break point after CALL_GENERATED_CODE we will see that this code does return and a value is provided:
bool has_exception = value->IsException(isolate);
In this case there was no exceptions so:
isolate->clear_pending_message();
return Handle<Object>(value, isolate);
Handle<Object> result = custom.Call(callback);
RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, Object);
which will expend to :
Isolate* __isolate__ = (isolate);
DCHECK(!__isolate__->has_pending_exception());
if (__isolate__->has_scheduled_exception()) {
__isolate__->PromoteScheduledException();
return MaybeHandle<Object>();
}
In this case there will be a scheduled_exception.
Object* Isolate::PromoteScheduledException() {
Object* thrown = scheduled_exception();
clear_scheduled_exception();
// Re-throw the exception to avoid getting repeated error reporting.
return ReThrow(thrown);
}
(lldb) job thrown
0x136bbb969d61: [JS_ERROR_TYPE]
- map = 0x136bd865de81 [FastProperties]
- prototype = 0x136bddd8d679
- elements = 0x136ba4c82241 <FixedArray[0]> [HOLEY_SMI_ELEMENTS]
- properties = 0x136bbb969d79 <PropertyArray[3]> {
#stack: 0x136baa1ca9b9 <AccessorInfo> (const accessor descriptor)
#message: 0x136b27c6e981 <String[18]: dummy setter error> (data field 0) properties[0]
0x136ba4c87069 <Symbol: stack_trace_symbol>: 0x136bbb969f49 <JSArray[26]> (data field 1) properties[1]
}
ReThrow
will:
set_pending_exception(exception);
return heap()->exception();
``
Notes:
```c++
RETURN_ON_FAILED_EXECUTION_PRIMITIVE(bool);
This macro is defined as:
#define RETURN_ON_FAILED_EXECUTION_PRIMITIVE(T) \
EXCEPTION_BAILOUT_CHECK_SCOPED_DO_NOT_USE(isolate, Nothing<T>())
Which will expend to
EXCEPTION_BAILOUT_CHECK_SCOPED_DO_NOT_USE(isolate, Nothing<bool>())
#define EXCEPTION_BAILOUT_CHECK_SCOPED_DO_NOT_USE(isolate, Nothing<bool>) \
do { \
if (has_pending_exception) { \
call_depth_scope.Escape(); \
return Nothing<bool>; \
} \
} while (false)
So the last lines in v8::Object::Set function will be:
if (has_pending_exception) {
call_depth_scope.Escape();
return Nothing<bool>;
}
return Just(true);
Setters
I'm guessing getters/setters are added using V8 Accessor (TODO: look at the v8 examples I have)
factory.h
When debugging you might come across a call that invokes a function in V8's factory, for example:
isolate->factory()->undefined_value();
Now, in the debugger you see something like:
ROOT_LIST(ROOT_ACCESSOR)
Lets take a closer look at the ROOT_LIST
macro. It is defined in factory.h:
#define ROOT_ACCESSOR(type, name, camel_name) \
inline Handle<type> name() { \
return Handle<type>(bit_cast<type**>( \
&isolate()->heap()->roots_[Heap::k##camel_name##RootIndex])); \
}
ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR
src/heap/heap.h:
#define STRONG_ROOT_LIST(V)
...
V(Oddball, undefined_value, UndefinedValue)
Which would expand to:
inline Handle<Oddball> undefined_value() {
return Handle<Oddball>(bit_cast<Oddball**>(
&isolate()->heap()->roots_[Heap::kUndefinedValueRootIndex]));
Recall that Oddball describes objects null, undefined, true, and false.
bit_cast
can be found in src/base/macros.h.
bootstrap_node.js compilation and execution walkthrough
The goal of this section is to understand what happens when bootstrap_node.js is compiled and run mainly focusing on the V8 side of things.
$ lldb -- out/Debug/node --print-ast
(lldb) br s -f node::LoadEnvironment
(lldb) r
Lets step through to the following line in node::ExecuteString
:
Local<Value> f_value = ExecuteString(env, MainSource(env), script_name);
MainSource is a function in node_javascript.h
which is used by node_js2c
. This was documented earlier so I won't go
into details about it now. You can see the content using:
(lldb) jlh MainSource(env)
Which is the same thing as calling:
(lldb) expr (*(v8::internal::Object**)*MainSource(env))->Print()
jlh
can be found in the V8 source tree in tools/lldbinit
.
ExecuteString
is a function in node.cc which calls Compile
:
MaybeLocal<v8::Script> script = v8::Script::Compile(env->context(), source, &origin);
Will delegate to ScriptCompiler::Compile
, and then to ScriptCompiler::CompileUnboundInternal
which can be found in
deps/v8/src/api.cc
:
MaybeLocal<UnboundScript> ScriptCompiler::CompileUnboundInternal(
Isolate* v8_isolate, Source* source, CompileOptions options) {
...
i::MaybeHandle<i::SharedFunctionInfo> maybe_function_info = i::Compiler::GetSharedFunctionInfoForScript(
str, name_obj, line_offset, column_offset, source->resource_options,
source_map_url, isolate->native_context(), NULL, &script_data,
options, i::NOT_NATIVES_CODE, host_defined_options);
Compiler::GetSharedFunctionForScript
in deps/v8/src/compiler.cc
:
ParseInfo parse_info(script);
Zone compile_zone(isolate->allocator(), ZONE_NAME);
...
maybe_result = CompileToplevel(&parse_info, isolate);
Lets take a look at parse_info:
(lldb) job *parse_info.script()
0x169ddd9abfb1: [Script] in OldSpace
- source: 0x169df8f0d3d1 <Very long string[21923]>
- name: 0x169df8f0d3a1 <String[17]: bootstrap_node.js>
- line_offset: 0
- column_offset: 0
- type: 2
- id: 14
- context data: 0x169dd88022e1 <undefined>
- wrapper: 0x169dd88022e1 <undefined>
- compilation type: 0
- line ends: 0x169dd88022e1 <undefined>
- eval from shared: 0x169dd88022e1 <undefined>
- eval from position: 0
- shared function infos: 0x169dd8802251 <FixedArray[0]>
This will be passed to CompileToplevel
:
if (parse_info->literal() == nullptr &&
!parsing::ParseProgram(parse_info, isolate)) {
...
std::forward_list<std::unique_ptr<CompilationJob>> inner_function_jobs;
std::unique_ptr<CompilationJob> outer_function_job(
GenerateUnoptimizedCode(parse_info, isolate, &inner_function_jobs));
...
In this case
(lldb) expr parse_info->literal() == nullptr
(bool) $80 = true
First, parsing::ParseProgram
will parse the JavaScript and produce the abstract syntax tree.
ParseProgram
:
result = parser.ParseProgram(isolate, info);
info->set_literal(result);
We can inspect parse_info after this function returns:
(lldb) expr parse_info->literal()->Print()
FUNC LITERAL at 0
. NAME <nil>
. INFERRED NAME 1
Now, this does not look like much but this is only the function literal:
(function(process) {
});
We can print the AST generated using:
(lldb) expr AstPrinter(isolate()).PrintProgram(parse_info()->literal())
(const char *) $56 = 0x0000000105016e20 "FUNC at 0\n. KIND 0\n. SUSPEND COUNT 0\n. NAME ""\n. INFERRED NAME ""\n. EXPRESSION STATEMENT at 284\n. . LITERAL "use strict"\n. EXPRESSION STATEMENT at 299\n. . ASSIGN at -1\n. . . VAR PROXY local[0] (0x10682f138) (mode = TEMPORARY) ".result"\n. . . FUNC LITERAL at 300\n. . . . NAME \n. . . . INFERRED NAME \n. . . . PARAMS\n. . . . . VAR (0x10681cd48) (mode = VAR) "process"\n. RETURN at -1\n. . VAR PROXY local[0] (0x10682f138) (mode = TEMPORARY) ".result"\n"
I've not figured out a good way to make this print nicely in lldb so using this is the more readable output:
[generating bytecode for function: ]
--- AST ---
FUNC at 0
. KIND 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. EXPRESSION STATEMENT at 284
. . LITERAL "use strict"
. EXPRESSION STATEMENT at 299
. . ASSIGN at -1
. . . VAR PROXY local[0] (0x10683f938) (mode = TEMPORARY) ".result"
. . . FUNC LITERAL at 300
. . . . NAME
. . . . INFERRED NAME
. . . . PARAMS
. . . . . VAR (0x10682d548) (mode = VAR) "process"
. RETURN at -1
. . VAR PROXY local[0] (0x10683f938) (mode = TEMPORARY) ".result"
Notice that we have use strict
starting a character 284
. This due to the comment that preceeds it.
We can see that the function literal starts at 300
and that if we would have given it a name it would
have shown up as NAME somename
. We can also see that it takes a parameter named process
Next GenerateUnoptimizedCode
will be called (deps/v8/src/compiler.cc
):
Compiler::EagerInnerFunctionLiterals inner_literals;
if (!Compiler::Analyze(parse_info, &inner_literals)) {
return std::unique_ptr<CompilationJob>();
}
std::unique_ptr<CompilationJob> outer_function_job(
PrepareAndExecuteUnoptimizedCompileJob(parse_info, parse_info->literal(), isolate));
PrepareAndExecuteUnoptimizedCompileJob
:
if (job->PrepareJob() == CompilationJob::SUCCEEDED &&
job->ExecuteJob() == CompilationJob::SUCCEEDED) {
PrepareJob()
will print the ast as shown above. Lets take a closer look at ExecuteJob
:
return UpdateState(ExecuteJobImpl(), State::kReadyToFinalize);
InterpreterCompilationJob::ExecuteJobImpl
:
generator()->GenerateBytecode(stack_limit());
This will land in BytecodeGenerator::GenerateBytecode
bytecode-generator.cc:906
GenerateBytecodeBody();
This function will visit all of the nodes in the AST and generate the bytecodes for them.
Later in FinalizeUnoptimizedCode
:
outer_function_job->compilation_info()->set_shared_info(shared_info);
(lldb) expr shared_info->code()->Print()
0x1a00905c50e1: [Code]
kind = BUILTIN
name = CompileLazy
compiler = unknown
Instructions (size = 983)
0x1a00905c5140 0 488b5f2f REX.W movq rbx,[rdi+0x2f]
0x1a00905c5144 4 488b5b07 REX.W movq rbx,[rbx+0x7]
0x1a00905c5148 8 493b5da0 REX.W cmpq rbx,[r13-0x60]
0x1a00905c514c c 0f844c030000 jz 0x1a00905c549e (CompileLazy)
...
Later in a call to InterpreterCompilationJob::FinalizeJobImpl
will delegate to BytecodeGenerator::FinalizeBytecode
where the the BytecodeArray is generated:
Handle<BytecodeArray> bytecode_array = builder()->ToBytecodeArray(isolate);
The bytecodes can be inspected using:
```console
(lldb) expr bytecodes->Print()
0x39d9bd7ade49: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
0 E> 0x39d9bd7ade82 @ 0 : 93 StackCheck
15 S> 0x39d9bd7ade83 @ 1 : 6f 00 00 00 CreateClosure [0], [0], #0
0x39d9bd7ade87 @ 5 : 1e fb Star r0
21644 S> 0x39d9bd7ade89 @ 7 : 97 Return
Constant pool (size = 1)
0x39d9bd7ade31: [FixedArray] in OldSpace
- map = 0x39d9d04022f1 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x39d9bd7add81 <SharedFunctionInfo bajja>
Handler Table (size = 16)
This bytecode array will be set on the compilation_info instance:
compilation_info()->SetBytecodeArray(bytecodes);
And the code will be set to InterpreterEntryTrampoline
:
compilation_info()->SetCode(
BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));
return SUCCEEDED;
This will return us into FinalizeUnoptimizedCompilationJob
:
if (status == CompilationJob::SUCCEEDED) {
InstallUnoptimizedCode(compilation_info);
CodeEventListener::LogEventsAndTags log_tag;
InstallUnoptimizedCode
will set up the FeedbackMetadata
:
Handle<FeedbackMetadata> feedback_metadata = FeedbackMetadata::New(
compilation_info->isolate(),
compilation_info->literal()->feedback_vector_spec());
compilation_info->shared_info()->set_feedback_metadata(*feedback_metadata);
We can inspect feedback_metadata
using:
(lldb) expr feedback_metadata->Print()
0x39d9bd7adea9: [FeedbackMetadata] in OldSpace
- length: 2
- slot_count: 1
Slot #0 kCreateClosure
Back in InstallUnoptimizedCode
:
shared->set_code(*compilation_info->code())
shared->set_bytecode_array(*compilation_info->bytecode_array());
This is setting the SharedFunctionInfo instance code to InterpreterEntryTrampoline
and the bytecode array
is also set that we saw before.
After this control will be returned to FinalizeUnoptimizedCompilationJob
and we go through all the inner_function_jobs
and set the SharedFunctionInfo for them.
for (auto&& inner_job : *inner_function_jobs) {
Handle<SharedFunctionInfo> inner_shared_info =
Compiler::GetSharedFunctionInfo(
inner_job->compilation_info()->literal(), parse_info->script(),
isolate);
if (inner_shared_info->is_compiled()) continue;
inner_job->compilation_info()->set_shared_info(inner_shared_info);
if (FinalizeUnoptimizedCompilationJob(inner_job.get()) !=
CompilationJob::SUCCEEDED) {
return false;
}
}
We can inspect inner_shared_info
using:
(lldb) expr inner_shared_info->Print()
...
(lldb) expr inner_shared_info->code()->Print()
0x1a00905c50e1: [Code]
kind = BUILTIN
name = CompileLazy
...
After this the shared_info will be returned from CompileToplevel
. Which will the return to Compiler::GetSharedFunctionInfoForScript
:
maybe_result = CompileToplevel(&parse_info, isolate);
...
Handle<FeedbackVector> feedback_vector = FeedbackVector::New(isolate, result);
vector = isolate->factory()->NewCell(feedback_vector);
compilation_cache->PutScript(source, context, language_mode, result, vector);
(lldb) expr feedback_vector->Print()
0x265e1d0b03b1: [FeedbackVector] in OldSpace
- length: 1
SharedFunctionInfo: 0x265e1d0adae9 <SharedFunctionInfo>
Optimized Code: 0
Invocation Count: 0
Profiler Ticks: 0
Slot #0 kCreateClosure
[0]: 0x265e1d0b03e1 <Cell value= 0x265eb5c022e1 <undefined>>
So we are backing out of the calls now and the next return will land us in CompileUnboundInternal
:
has_pending_exception = !maybe_function_info.ToHandle(&result);
This will later return to ScriptCompiler::Compile
:
auto maybe = CompileUnboundInternal(isolate, source, options);
...
v8::Context::Scope scope(context);
return result->BindToCurrentContext();
(lldb) expr maybe
(lldb) expr maybe.ToLocalChecked()->GetId()
(int) $415 = 14
(lldb) jlh maybe.ToLocalChecked()->GetScriptName()
"bootstrap_node.js"
An UnboundScript
is a compiled JavaScript but it is not yet tied to a Context.
After this the compilation is finished and control will be returned to node.cc which will now run the script:
Local<Value> result = script.ToLocalChecked()->Run();
So we have compiled the script and now are are going to run it. Just to clarify something here, the script contains an expression
(the surrounding ()) that defines a function. So we are not executing the startup
function here but instead only only the
expression that contains it.
Run
will call Execution::Call
, which will call CallInternal
, which will call Invoke
:
typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
Object* receiver, int argc,
Object*** args);
Handle<Code> code = is_construct
? isolate->factory()->js_construct_entry_code()
: isolate->factory()->js_entry_code();
...
// start of the function identified by code->entry() address.
JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
// Call the function through the right JS entry stub.
Object* orig_func = *new_target;
Object* func = *target;
Object* recv = *receiver;
Object*** argv = reinterpret_cast<Object***>(args);
if (FLAG_profile_deserialization && target->IsJSFunction()) {
PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
}
RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
}
Notice that code
will be what isolate->factory()->js_entry_code()
returns:
(lldb) expr isolate->factory()->js_entry_code()->Print()
0xe6d84b84001: [Code]
kind = STUB
major_key = JSEntryStub
compiler = unknown
Instructions (size = 232)
0xe6d84b84060 0 55 push rbp
0xe6d84b84061 1 4889e5 REX.W movq rbp,rsp
0xe6d84b84064 4 6a02 push 0x2
...
I'm not showing the complete output above as this will be shown later in detail.
Next, we have CALL_GENERATED_CODE
which can be found in src/x64/simulator-x64.h
:
// Since there is no simulator for the x64 architecture the only thing we can
// do is to call the entry directly.
// TODO(X64): Don't pass p0, since it isn't used?
#efine CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
(entry(p0, p1, p2, p3, p4))
So this will be an call that looks like this:
entry(orig_func, func, recv, argc, argv);
If we use step instruction
(si) we can follow the setup and calling of this function:
0x100e381b4 <+1348>: movq -0x118(%rbp), %rdx // move the value of address of local variable code->entry() into rdx
0x100e381bb <+1355>: movq -0x120(%rbp), %rdi // first argument which is local variable orig_func
-> 0x100e381c2 <+1362>: movq -0x128(%rbp), %rsi // second argument which is local variable func
0x100e381c9 <+1369>: movq -0x130(%rbp), %rcx // third argument which is local variable recv
0x100e381d0 <+1376>: movl -0x30(%rbp), %eax // fourth arg which is local variable argc
0x100e381d3 <+1379>: movq -0x138(%rbp), %r8 // fifth argument which is local variable argv
0x100e381da <+1386>: movq %rdx, -0x1c0(%rbp) // move code->entry from rdx into local variable
0x100e381e1 <+1393>: movq %rcx, %rdx // move recv into rdx
0x100e381e4 <+1396>: movl %eax, %ecx // move argc into ecx
0x100e381e6 <+1398>: movq -0x1c0(%rbp), %r9 // move code-entry into register r9
0x100e381ed <+1405>: callq *%r9 // call address in register r9
Below we verify the contents of the above instructions:
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x118`
0x7fff5fbfd828: 0x0000302f34084060
(lldb) expr code->entry()
(byte *) $186 = 0x0000302f34084060
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x120`
0x7fff5fbfd820: 0x00001d64e5d822e1
(lldb) expr orig_func
(v8::internal::Object *) $209 = 0x00001d64e5d822e1
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x128`
0x7fff5fbfd818: 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $211 = 0x00001d6417d30669
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x130`
0x7fff5fbfd810: 0x00001d64f0a82239
(lldb) expr recv
(v8::internal::Object *) $213 = 0x00001d64f0a82239
(lldb) memory read -f x -c 1 -s 4 `$rbp - 0x30`
0x7fff5fbfd910: 0x00000000
(lldb) expr argc
(int) $216 = 0
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x138`
0x7fff5fbfd808: 0x0000000000000000
(lldb) expr argv
(v8::internal::Object ***) $219 = 0x0000000000000000
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x1c0`
0x7fff5fbfd780: 0x0000302f34084060
The last instruction callq *%r9
will call into JSEntryStub:
-> 0x302f34084060: pushq %rbp // push callers base frame pointer saving it so we can restore it
(lldb) memory read -f x -c 1 -s 8 `$rsp` // inspect the stack
0x302f34084061: movq %rsp, %rbp // mov the current value of rsp to into rbp which will be the frame pointer for this function
0x302f34084064: pushq $0x2 // this is pushing an immediate value 2 onto the stack. Where does this come from, the following:
(lldb) expr v8::internal::StackFrame::MarkerToType(2)
(v8::internal::StackFrame::Type) $494 = ENTRY
v8::internal::StackFrame::Type::ENTRY))
(int32_t) $262 = 2
(lldb) memory read -f x -c 2 -s 8 `$rsp` // inspect the stack
0x302f34084066: movabsq $0x106001990, %r10 // move the context address into r10
(lldb) up 2
(lldb) expr isolate->isolate_addresses_[IsolateAddressId::kContextAddress]
(v8::internal::Address) $230 = 0x0000000106001990
0x302f34084070: movq (%r10), %r10 // the context address is a pointer, this will dereference it
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
(lldb) register read r10
r10 = 0x00001d6417d03a59
0x302f34084073: pushq %r10 // push the dereferences context onto the stack
0x302f34084075: pushq %r12 // r12 must be preserved accross function calls so save it and pop it later before returning
0x302f34084077: pushq %r13 // r13 must be preserved accross function calls so save it and pop it later before returning
0x302f34084079: pushq %r14 // r14 must be preserved accross function calls so save it and pop it later before returning
0x302f3408407b: pushq %r15 // r14 must be preserved accross function calls so save it and pop it later before returning
0x302f3408407d: pushq %rbx // rbx must be preserved accross function calls so save it and pop it later before returning
0x302f3408407e: movabsq $0x106000048, %r13 // move the value of the roots_array_start into r13
(lldb) expr isolate->heap()->roots_array_start()
(v8::internal::Object **) $233 = 0x0000000106000048
0x302f34084088: addq $0x80, %r13 // addp(kRootRegister, Immediate(kRootRegisterBias)); kRootRegisterBias is 128
0x302f3408408f: movabsq $0x106001a00, %r10 // move he CEntryFPAddress into r10
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kCEntryFPAddress]
0x106001a00: 0x0000000000000000
0x302f34084099: pushq (%r10) // dereference CEntryAddress and push onto the stack
(lldb) memory read -f x -c 1 -s 8 0x0000000106001a00
0x106001a00: 0x0000000000000000
0x302f3408409c: movabsq 0x106001a20, %rax // move JSEntrySPAddress into rax
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
0x106001a20: 0x0000000000000000
0x302f340840a6: testq %rax, %rax // is rax zero? if so there is an outer js call
(lldb) register read rax
rax = 0x0000000000000000
0x302f340840af: pushq $0x2 // v8::internal::StackFrame::OUTERMOST_JSENTRY_FRAME))
0x302f340840b1: movq %rbp, %rax // move this functions base pointer into rax
0x302f340840b4: movabsq %rax, 0x106001a20 // store the base pointer in isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x0000000000000000
after
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x00007fff5fbfd760
(lldb) register read rbp
rbp = 0x00007fff5fbfd940
0x302f340840be: jmp 0x302f340840c5
0x302f340840c5: jmp 0x302f340840e0
0x302f340840e0: movabsq $0x106001a08, %r10 // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kHandlerAddress]`
0x106001a08: 0x0000000000000000
0x302f340840ea: pushq (%r10) // push the dereferenced handler address
0x302f340840ed: movabsq $0x106001a08, %r10 // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
0x302f340840f7: movq %rsp, (%r10) // move the value of the current stack pointer into the object pointed to be r10
(lldb) memory read -f x -c 1 -s 8 0x106001a08
0x106001a08: 0x00007fff5fbfd710
(lldb) register read rsp
rsp = 0x00007fff5fbfd710
0x302f340840fa: callq 0x302f341418c0 // call JSEntryTrampoline builtin
(lldb) expr isolate->builtins()->builtin_handle(Builtins::Name::kJSEntryTrampoline)->entry()
(byte *) $255 = 0x0000302f341418c0
In deps/v8/src/builtins/x64/builtins-x64.cc
we can find Generate_JSEntryTrampolineHelper
which is what generates the builtin. As this is done
a compile time we can put a break point in it (you can debug mksnapshot though which is done elsewhere in this document).
0x302f341418c0: movq %rdi, %r11 // move the orig_fun into r11
0x302f341418c3: movq %rsi, %rdi // move func into rdi
0x302f341418c6: xorl %esi, %esi // zero out esi
0x302f341418c8: pushq %rbp // EnterFrame (deps/v8/src/x64/macro-assembler-x64.cc)
0x302f341418c9: movq %rsp, %rbp //
0x302f341418cc: pushq $0x1c // push 0x1c (decimal 28)
(lldb) p v8::internal::StackFrame::TypeToMarker(static_cast<v8::internal::StackFrame::Type>(v8::internal::StackFrame::Type::INTERNAL))
(int32_t) $256 = 28
0x302f341418ce: movabsq $0x302f34141861, %r10 // CodeObject() what is this, seems like it is Handle<HeapObject> which will be patched later
// I think this might be the register file?
(lldb) memory read -f x -c 1 -s 8 0x302f34141861
0x302f34141861: 0x6900001d64ceb827
0x302f341418d8: pushq %r10 // push the HandleHeapObject into the stack
0x302f341418da: movabsq $0x1d64e5d822e1, %r10 // move undefined value into r10
(lldb) expr *isolate->factory()->undefined_value()
(v8::internal::Oddball *) $264 = 0x00001d64e5d822e1
0x302f341418e4: cmpq %r10, (%rsp)
0x302f341418e8: jne 0x302f341418fa // last of EnterFrame if we don't abort that is
0x302f341418fa: movabsq $0x106001990, %r10 // move the ContextAdress into r10 (the scratch register for x86)
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
0x302f34141904: movq (%r10), %rsi // deref and move into context into rsi
0x302f34141907: pushq %rdi // push function onto the stack
0x302f34141908: pushq %rdx // push recv onto the stack (this was moved above with 0x302f34141908: pushq %rdx)
0x302f34141909: movq %rcx, %rax // move argc into rax (this was moved above with 0x100e381e4 <+1396>: movl %eax, %ecx)
0x302f3414190c: movq %r8, %rbx // move argv into rbx
0x302f3414190f: movq %r11, %rdx // move orig_func into rdx
// Generate_CheckStackOverflow TODO: got through this as I struggled to understand/map the generated instructions to the source code :(
0x302f34141912: movq 0xd08(%r13), %r10 //
0x302f34141919: movq %rsp, %rcx
0x302f3414191c: subq %r10, %rcx
0x302f3414191f: movq %rax, %r11
0x302f34141922: shlq $0x3, %r11
0x302f34141926: cmpq %r11, %rcx
0x302f34141929: jg 0x302f34141940
// Generate_JSEntryTrampolineHelper
0x302f34141940: xorl %ecx, %ecx // zero out ecx, for the following loop
0x302f34141942: jmp 0x302f3414194f // jump to entry label
0x302f3414194f: cmpq %rax, %rcx // compare argc with rcx (this was moved above).
0x302f34141952: jne 0x302f34141944 // entry the loop if. This is pushing the argv values onto the stack in a loop, but we don't have any so we fall through
0x302f34141954: callq 0x302f3413c400 //
(lldb) expr isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $279 = 0x0000302f3413c400 "@\xfffffff6\xffffffc7\x01\x0f\xffffff84F"
// for the full object with assembly language instructions:
(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// Builtins::Generate_Call_ReceiverIsAny 'deps/v8/src/builtins/builtins-call-gen.cc':
// void Builtins::Generate_Call_ReceiverIsAny(MacroAssembler* masm) {
// Generate_Call(masm, ConvertReceiverMode::kAny);
// }
// Builtins::Generate_Call `deps/v8/src/builtins/x64/builtins-x64.cc`
0x302f3413c400: testb $0x1, %dil // move the immediate 1 into the low 8 bits or rdi // __ JumpIfSmi(rdi, &non_callable); which consists of CheckSmi
0x302f3413c404: je 0x302f3413c450 // if ZF = 1 then jump. This would happen if rdi was a SMI// __ JumpIfSmi(rdi, &non_callable: which after CheckSmi will jump. rdi is the target and not a smi in our case
// recall that rdi is the function:
(lldb) register read rdi
rdi = 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $280 = 0x00001d6417d30669
Now I think that func is/was of type JSFunction (deps/v8/src/objects.h) as it was cast to Object* by:
auto fun = i::Handle<i::JSFunction>::cast(Utils::OpenHandle(this));
class JSFunction: public JSObject {
public:
// [prototype_or_initial_map]:
DECL_ACCESSORS(prototype_or_initial_map, Object)
0x302f3413c40a: movq -0x1(%rdi), %rcx // move the map into rcx. CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx). HeapObject::kMapOffset which is the first field in JSFunction:
(lldb) memory read -f x -c 1 -s 8 `$rdi - 1`
0x1d6417d30668: 0x00001d6433c82521
(lldb) expr JSFunction::cast(func)->prototype_or_initial_map()
(v8::internal::Object *) $286 = 0x00001d64e5d82321
0x302f3413c40e: cmpb $-0x1, 0xb(%rcx) // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
0x302f3413c412: je 0x302f3413be20 // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
// I was not sure where to find this `j` function but it is in src/x64/assembler-x64.cc (Assembler::j but note
// that there are multiple overloaded j functions so make sure you are looking at the correct one. There is
// as section that discusses Assembler::j in detail later in this document.
// so what are we jumping to? We can back up in the debugger and find out:
// (lldb) up 2
// (lldb job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// This will be a builtin named CallFunction_ReceiverIsAny, and being a builtin you'll find it in
// `src/builtins/builtins-definitions.h`. The implementation will be in `src/builtins/builtins-call.cc` and
// will result in `return builtin_handle(kCallFunction_ReceiverIsAny)`. This code repsonsible for generating
// is Builtins::Generate_CallFunction and in our case that means `src/builtins/x64/builtins-x64.cc`
(lldb) expr isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $317 = 0x0000302f3413be20
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc)
0x302f3413be20: testb $0x1, %dil // AssertFunction (deps/v8/src/x64/macro-assembler-x64.cc) check if rdi is of type smi (testb(object, Immediate(kSmiTagMask));)
// rdi is the function to call:
(lldb) register read rdi
rdi = 0x00001d6417d30669
(lldb) expr JSFunction::cast(func)
(v8::internal::JSFunction *) $320 = 0x00001d6417d30669
0x302f3413be24: jne 0x302f3413be36 // this is also generated by AssertFunciton and the call to Assembler::j If not equal this code will fall through
0x302f3413be36: pushq %rdi // AssertFunction still, push rdi (the function) onto the stack
0x302f3413be37: movq -0x1(%rdi), %rdi // move the map into rdi. CmpObjectType(object, JS_FUNCTION_TYPE, object). HeapObject::kMapOffset which is the first field in JSFunction
0x302f3413be3b: cmpb $-0x1, 0xb(%rdi) // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
0x302f3413be3f: popq %rdi // pop function from stack into rdi again
0x302f3413be40: je 0x302f3413be52 // jump if equal will jump to a L label and return from the Check call and then return from AssertFunction
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc)
0x302f3413be52: movq 0x1f(%rdi), %rdx // move the SharedFunctionInfo into rdx:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x1f`
0x1d6417d30688: 0x00001d6417d2dc21
(lldb) expr JSFunction::cast(func)->shared()
(v8::internal::SharedFunctionInfo *) $325 = 0x00001d6417d2dc21
0x302f3413be56: testb $-0x20, 0x87(%rdx) // testl(FieldOperand(rdx, SharedFunctionInfo::kCompilerHintsOffset)
(lldb) expr JSFunction::cast(func)->shared()->compiler_hints()
(int) $332 = 1056770
(lldb) memory read -f dec -c 1 -s 4 `($rdx + 0x87)`
0x1d6417d2dca8: 1056770
0x302f3413be5d: jne 0x302f3413bfbc // __ j(not_zero, &class_constructor);
0x302f3413be63: movq 0x27(%rdi), %rsi // move the functions context info rsi:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x27`
0x1d6417d30690: 0x00001d6417d03a59
(lldb) expr JSFunction::cast(func)->context()
(v8::internal::Context *) $345 = 0x00001d6417d03a59
0x302f3413be67: testb $0x3, 0x87(%rdx) // check SharedFunctionInfo::IsNativeBit::kMask | SharedFunctionInfo::IsStrictBit::kMask
0x302f3413be6e: jne 0x302f3413bf14 //
0x302f3413bf14: movslq 0x73(%rdx), %rbx // __ movsxlq(rbx, FieldOperand(rdx, SharedFunctionInfo::kFormalParameterCountOffset))
0x302f3413bf18: movabsq $0x105300b92, %r10 // move hook_on_function_call_address into scratch register; part of CheckDebugHook
(lldb) expr isolate->debug()->hook_on_function_call_address()
(v8::internal::Address) $353 = 0x0000000105300b92
0x302f3413bf22: cmpb $0x0, (%r10) // how is this generate? In CheckDebugHook I can only find cmpb(debug_hook_active_operand, Immediate(0)); but not he previous moveabsq
0x302f3413bf26: je 0x302f3413bfa4 // will jump to the label at the end of CheckDebugHook
// MacroAssembler::InvokeFunction
0x302f3413bfa4: movq -0x60(%r13), %rdx // move the UndefinedValueRootIndex into rdx generated by LoadRoot(rdx, Heap::kUndefinedValueRootIndex);
(lldb) memory read -f x -c 1 -s 8 `$r13 - 0x60`
0x106000068: 0x00001d64e5d822e1
lldb) expr isolate->heap()->roots_[Heap::RootListIndex::kUndefinedValueRootIndex]
(v8::internal::Object *) $354 = 0x00001d64e5d822e1
// InvokePrologue(expected, actual, &done, &definitely_mismatches, flag, Label::kNear)
0x302f3413bfa8: cmpq %rax, %rbx // Set(rax, actual.immediate());
0x302f3413bfab: je 0x302f3413bfb2 // will return from InvokePrologue
0x302f3413bfb2: movq 0x37(%rdi), %rcx // move the function code into rcx (movp(rcx, FieldOperand(function, JSFunction::kCodeOffset)))
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x37`
0x1d6417d306a0: 0x0000302f34144281
(lldb) expr JSFunction::cast(func)->code()
(v8::internal::Code *) $358 = 0x0000302f34144281
0x302f3413bfb6: addq $0x5f, %rcx // addp(rcx, Immediate(Code::kHeaderSize - kHeapObjectTag)); the instruction start follows the Code object header
(lldb) job JSFunction::cast(func)->code()
0x302f34144281: [Code]
yind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x302f341442e0 0 488b5f2f REX.W movq rbx,[rdi+0x2f]
(lldb) memory read -f x -c 1 -s8 `$rcx + 0x5f`
0x302f341442e0: 0x075b8b482f5f8b48
// notice that rcx now point to the first instruction.
0x302f3413bfba: jmpq *%rcx // now jump to the first instruction of function :)
0x302f341442e0: movq 0x2f(%rdi), %rbx // move the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x2f`
0x1d6417d30698: 0x00001d6417d306e9
(lldb) expr JSFunction::cast(func)->feedback_vector()
(v8::internal::FeedbackVector *) $363 = 0x00001d6417d306a9
(lldb) job JSFunction::cast(func)->feedback_vector()
0x1d6417d306a9: [FeedbackVector] in OldSpace
- length: 1
SharedFunctionInfo: 0x1d6417d2dc21 <SharedFunctionInfo>
Optimized Code: 0
Invocation Count: 0
Profiler Ticks: 0
Slot #0 kCreateClosure
[0]: 0x1d6417d306d9 <Cell value= 0x1d64e5d822e1 <undefined>>
302f341442e4: movq 0x7(%rbx), %rbx // move Slot[0] from the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0x7`
0x1d6417d306f0: 0x00001d6417d306a9
// MaybeTailCallOptimizedCodeSlot(masm, feedback_vector, rcx, r14, r15);
// MaybeTailCallOptimizedCodeSlot(MacroAssembler* masm, Register feedback_vector, Register scratch1, Register scratch2, Register scratch3)
// Register closure = rdi;
// Register optimized_code_entry = rcx;
0x302f341442e8: movq 0xf(%rbx), %rcx // __ movp(optimized_code_entry, FieldOperand(feedback_vector, FeedbackVector::kOptimizedCodeOffset));
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0xf`
0x1d6417d306b8: 0x0000000000000000
(lldb) expr JSFunction::cast(func)->feedback_vector()->optimized_code()
(v8::internal::Code *) $367 = 0x0000000000000000
0x302f341442ec: testb $0x1, %cl // smi test against low 8 bit of rcx called from JumpIfNotSmi -> CheckSmi
0x302f341442ef: jne 0x302f34144486 // JumpIfNotSmi -> Assembler::j
0x302f341442f5: testb $0x1, %cl // test again but this time from MacroAssembler::SmiCompare and its call to AssertSmi which calls CheckSmi
0x302f341442f8: je 0x302f3414430a // SmiCompare -> AssertSmi -> Check
Look into static inline Code* GetCodeFromTargetAddress(Address address); which could possible be used to get the code of an address which could be useful when debugging and you only have the address that is being jumped to.
(lldb) expr JSFunction::cast(func)->code()
(v8::internal::Code *) $477 = 0x00000e6d84c44281
(lldb) expr JSFunction::cast(func)->shared()->abstract_code()->Print()
0x265e1d0ade49: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
0 E> 0x265e1d0ade82 @ 0 : 93 StackCheck
15 S> 0x265e1d0ade83 @ 1 : 6f 00 00 00 CreateClosure [0], [0], #0
0x265e1d0ade87 @ 5 : 1e fb Star r0
21644 S> 0x265e1d0ade89 @ 7 : 97 Return
Constant pool (size = 1)
0x265e1d0ade31: [FixedArray] in OldSpace
- map = 0x265ed93822f1 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x265e1d0add81 <SharedFunctionInfo bajja>
Handler Table (size = 16)
Side note on the interpreters dispatch_table_ ...
Every isolate has a interpreter as a member. An interpreter has a dispatch_table_ array of Address's (byte*):
```console
(lldb) expr isolate->interpreter()->dispatch_table_
(Address [768]) $292 = {
Notice that these are addresses which are index by Bytecode's. For example lets take a look the address for
CreateClosure
:
(lldb) expr isolate->interpreter()->GetBytecodeHandler(static_cast<Bytecode>(Bytecode::kCreateClosure),static_cast<OperandScale>(1))->Print()
How is `dispatch_table_` populated?
This is done in the `Heap::IterateStrongRoots` function:
```c++
isolate_->interpreter()->IterateDispatchTable(v);
Which is called by StartupDeserializer::DeserializeInto
:
isolate->heap()->IterateStrongRoots(this, VISIT_ONLY_STRONG_ROOT_LIST);
isolate->heap()->IterateSmiRoots(this);
isolate->heap()->IterateStrongRoots(this, VISIT_ONLY_STRONG);
which is called by Isolate::Init
:
if (!create_heap_objects) des->DeserializeInto(this);
PrepareAndExecuteUnoptimizedCompileJob
deps/v8/src/compiler.cc
0x169cbdb41e04 external reference (Runtime::StackGuard) (0x101440100)
0x169cbdb41e0d code target (STUB) (0x169cbda84740)
If you look closely the first instruction is just setting rax to zero using xor.
Next, we are pushing the pointer to the function, in this case Runtime::StackGuard
into rbx.
We then jump to CEntryStub
.
What is Runtime::StackGuard
?
Well, it is defined in a macro in src/runtime/runtime.h
:
...
F(StackGuard, 0, 1)
...
#define FOR_EACH_INTRINSIC(F) \
FOR_EACH_INTRINSIC_RETURN_PAIR(F) \
FOR_EACH_INTRINSIC_RETURN_OBJECT(F)
#define F(name, nargs, ressize) \
Object* Runtime_##name(int args_length, Object** args_object, \
Isolate* isolate);
FOR_EACH_INTRINSIC_RETURN_OBJECT(F)
#undef F
StackGuard
is included in FOR_EACH_INTRINSIC_INTERNAL
which is included by FOR_EACH_INTRINSIC_RETURN_OBJECT
.
So that should expand to:
Object* Runtime_StackGuard(int args_lentgh, Object** args_object, Isolate* isolate);
If we take a look at src/runtime/runtime.h
we can find:
static const Runtime::Function kIntrinsicFunctions[] = {
FOR_EACH_INTRINSIC(F)
FOR_EACH_INTRINSIC(I)
};
And the indexes into this array are in the enum Runtime::FunctionId
:
(lldb) expr kIntrinsicFunctions[v8::internal::Runtime::FunctionId::kStackGuard]
(const Function) $371 = (function_id = kStackGuard, intrinsic_type = RUNTIME, name = "StackGuard", entry = "UH\xffffff89\xffffffe5H\xffffff83\xffffffecP\xffffff89}\xfffffff4H\xffffff89u\xffffffe8H\xffffff89U\xffffffe0H\xffffff8b}\xffffffe0\xffffffe8\xffffffb4d\x04\xffffffff\xffffffb1\x01H\xffffff83\xfffffff8", nargs = '\0', result_size = '\x01')
There is also a function named Runtime::FunctionForId
which can be used:
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))
(const Function *) $1058 = 0x00000001029f0340
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->name
(const char *const) $1059 = 0x0000000101c8be27 "StackGuard"
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->nargs
(int8_t) $1060 = '\0'
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->intrinsic_type
(const IntrinsicType) $1061 = RUNTIME
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->entry
(v8::internal::Address) $1063 = 0x000000010142de70 "UH\x89�H��P�}�H�u�H�U�H�}���\x14\x04��\x01H\x83�
If nargs
is -1
then the funnction takes a variable number of arguments.
We can also disassemble the entry
address using:
(lldb) dis -s 0x000000010142de70
node`v8::internal::Runtime_StackGuard:
0x10142de70 <+0>: pushq %rbp
0x10142de71 <+1>: movq %rsp, %rbp
0x10142de74 <+4>: subq $0x50, %rsp
0x10142de78 <+8>: movl %edi, -0xc(%rbp)
0x10142de7b <+11>: movq %rsi, -0x18(%rbp)
0x10142de7f <+15>: movq %rdx, -0x20(%rbp)
0x10142de83 <+19>: movq -0x20(%rbp), %rdi
0x10142de87 <+23>: callq 0x10046f360 ; v8::internal::Isolate::context at isolate.h:596
0x10142de8c <+28>: movb $0x1, %cl
Now that we know this we can disassemble the complete function using:
(lldb) dis -n v8::internal::Runtime_StackGuard
As well as any other runtime function we might be interested in later. How are these stored?
They are stored in a global named kIntrinsicFunctions
:
(lldb) target variable kIntrinsicFunctions
(lldb) expr kIntrinsicFunctions[v8::internal::Runtime::FunctionId::kStackGuard]
(const v8::internal::Runtime::Function) $227 = (function_id = kStackGuard, intrinsic_type = RUNTIME, name = "StackGuard", entry = "UH\xffffff89\xffffffe5H\xffffff83\xffffffecP\xffffff89}\xfffffff4H\xffffff89u\xffffffe8H\xffffff89U\xffffffe0H\xffffff8b}\xffffffe0\xffffffe8td\x04\xffffffff\xffffffb1\x01H\xffffff83\xfffffff8", nargs = '\0', result_size = '\x01')
(lldb) expr kIntrinsicFunctions[Builtins::Name::kStackCheck].entry
(v8::internal::Address) $229 = 0x0000000101440100 "UH\x89�H��P�}�H�u�H�U�H�}��td\x04��\x01H\x83�
When is this array populated?
Being a global it is part of the data section in the object file and will be loaded into memory upon start up.
Just to recap, first the address of Runtime_StackGuard
will be moved into register rbx
, and then we will jump CEntryStub
.
(lldb) job *isolate->builtins()->builtin_handle(Builtins::Name::kStackCheck)
0x302f34141da1: [Code]
kind = BUILTIN
name = StackCheck
compiler = unknown
Instructions (size = 17)
0x302f34141e00 0 33c0 xorl rax,rax
0x302f34141e02 2 48bbc000440101000000 REX.W movq rbx,0x1014400c0 ;; external reference (Runtime::StackGuard)
0x302f34141e0c c e92f29f4ff jmp 0x302f34084740 ;; code: STUB, CEntryStub, minor: 8
RelocInfo (size = 3)
0x302f34141e04 external reference (Runtime::StackGuard) (0x1014400c0)
0x302f34141e0d code target (STUB) (0x302f34084740)
(lldb) expr kIntrinsicFunctions[v8::internal::Runtime::FunctionId::kStackGuard].entry
(v8::internal::Address) $380 = 0x00000001014400c0 "UH\x89�H��P�}�H�u�H�U�H�}��d\x04��\x01H\x83�
We can see that the 0x1014400c0
match. And we can disassemble this address using:
(lldb) dis -s 0x1014400c0
node`v8::internal::Runtime_StackGuard:
0x1014400c0 <+0>: pushq %rbp
0x1014400c1 <+1>: movq %rsp, %rbp
0x1014400c4 <+4>: subq $0x50, %rsp
0x1014400c8 <+8>: movl %edi, -0xc(%rbp)
0x1014400cb <+11>: movq %rsi, -0x18(%rbp)
0x1014400cf <+15>: movq %rdx, -0x20(%rbp)
0x1014400d3 <+19>: movq -0x20(%rbp), %rdi
0x1014400d7 <+23>: callq 0x100486590 ; v8::internal::Isolate::context at isolate.h:596
0x1014400dc <+28>: movb $0x1, %cl
What about CEntryStub
that is being jumped to?
0x302f34141e0c c e92f29f4ff jmp 0x302f34084740 ;; code: STUB, CEntryStub, minor: 8
CEntryStub
is declared in deps/v8/src/code-stubs.h
:
class CodeStub : public ZoneObject {
...
};
class PlatformCodeStub : public CodeStub {
Handle<Code> GenerateCode() override;
...
};
class CEntryStub : public PlatformCodeStub {
...
};
Where is the CEntryStub stored?
jmp 0x302f34084740 ;; code: STUB, CEntryStub, minor: 8
(lldb) expr CodeStub::Major::CEntry
(int) $404 = 4
(lldb) expr CodeStub::MajorKeyFromKey(4)
(v8::internal::CodeStub::Major) $405 = CEntry
(lldb) expr isolate->heap()->code_stubs()->FindEntry(isolate, 4)
(int) $422 = 451
(lldb) job Code::cast(isolate->heap()->code_stubs()->ValueAt(451))
0x302f34284001: [Code]
kind = STUB
major_key = CEntryStub
compiler = unknown
Instructions (size = 327)
0x302f34284060 0 55 push rbp
0x302f34284061 1 4889e5 REX.W movq rbp,rsp
0x302f34284064 4 6a06 push 0x6
0x302f34284066 6 6a00 push 0x0
0x302f34284068 8 49ba014028342f300000 REX.W movq r10,0x302f34284001
0x302f34284072 12 4152 push r10
0x302f34284074 14 4c8bf0 REX.W movq r14,rax
0x302f34284077 17 49ba001a000601000000 REX.W movq r10,0x106001a00
0x302f34284081 21 49892a REX.W movq [r10],rbp
0x302f34284084 24 49ba9019000601000000 REX.W movq r10,0x106001990
0x302f3428408e 2e 498932 REX.W movq [r10],rsi
0x302f34284091 31 49ba101a000601000000 REX.W movq r10,0x106001a10
0x302f3428409b 3b 49891a REX.W movq [r10],rbx
0x302f3428409e 3e 4e8d7cf508 REX.W leaq r15,[rbp+r14*8+0x8]
0x302f342840a3 43 4883e4f0 REX.W andq rsp,0xf0
0x302f342840a7 47 488965f0 REX.W movq [rbp-0x10],rsp
0x302f342840ab 4b 40f6c40f testb rsp,0xf
0x302f342840af 4f 7401 jz 0x302f342840b2 <+0x52>
0x302f342840b1 51 cc int3l
0x302f342840b2 52 498bfe REX.W movq rdi,r14
0x302f342840b5 55 498bf7 REX.W movq rsi,r15
0x302f342840b8 58 48ba0000000601000000 REX.W movq rdx,0x106000000
0x302f342840c2 62 ffd3 call rbx
0x302f342840c4 64 493b8588000000 REX.W cmpq rax,[r13+0x88]
0x302f342840cb 6b 0f8447000000 jz 0x302f34284118 <+0xb8>
0x302f342840d1 71 4d8b75a8 REX.W movq r14,[r13-0x58]
0x302f342840d5 75 49baa019000601000000 REX.W movq r10,0x1060019a0
0x302f342840df 7f 4d3b32 REX.W cmpq r14,[r10]
0x302f342840e2 82 7401 jz 0x302f342840e5 <+0x85>
0x302f342840e4 84 cc int3l
0x302f342840e5 85 488b4d08 REX.W movq rcx,[rbp+0x8]
0x302f342840e9 89 488b6d00 REX.W movq rbp,[rbp+0x0]
0x302f342840ed 8d 498d6708 REX.W leaq rsp,[r15+0x8]
0x302f342840f1 91 51 push rcx
0x302f342840f2 92 49ba9019000601000000 REX.W movq r10,0x106001990
0x302f342840fc 9c 498b32 REX.W movq rsi,[r10]
0x302f342840ff 9f 49c70200000000 REX.W movq [r10],0x0
0x302f34284106 a6 49ba001a000601000000 REX.W movq r10,0x106001a00
0x302f34284110 b0 49c70200000000 REX.W movq [r10],0x0
0x302f34284117 b7 c3 retl
0x302f34284118 b8 48c7c700000000 REX.W movq rdi,0x0
0x302f3428411f bf 48c7c600000000 REX.W movq rsi,0x0
0x302f34284126 c6 48ba0000000601000000 REX.W movq rdx,0x106000000
0x302f34284130 d0 4989e2 REX.W movq r10,rsp
0x302f34284133 d3 4883ec08 REX.W subq rsp,0x8
0x302f34284137 d7 4883e4f0 REX.W andq rsp,0xf0
0x302f3428413b db 4c891424 REX.W movq [rsp],r10
0x302f3428413f df 48b8b0aa430101000000 REX.W movq rax,0x10143aab0
0x302f34284149 e9 40f6c40f testb rsp,0xf
0x302f3428414d ed 7401 jz 0x302f34284150 <+0xf0>
0x302f3428414f ef cc int3l
0x302f34284150 f0 ffd0 call rax
0x302f34284152 f2 488b2424 REX.W movq rsp,[rsp]
0x302f34284156 f6 49bab019000601000000 REX.W movq r10,0x1060019b0
0x302f34284160 100 498b32 REX.W movq rsi,[r10]
0x302f34284163 103 49bad019000601000000 REX.W movq r10,0x1060019d0
0x302f3428416d 10d 498b22 REX.W movq rsp,[r10]
0x302f34284170 110 49bac819000601000000 REX.W movq r10,0x1060019c8
0x302f3428417a 11a 498b2a REX.W movq rbp,[r10]
0x302f3428417d 11d 4885f6 REX.W testq rsi,rsi
0x302f34284180 120 7404 jz 0x302f34284186 <+0x126>
0x302f34284182 122 488975f8 REX.W movq [rbp-0x8],rsi
0x302f34284186 126 49bab819000601000000 REX.W movq r10,0x1060019b8
0x302f34284190 130 498b3a REX.W movq rdi,[r10]
0x302f34284193 133 49bac019000601000000 REX.W movq r10,0x1060019c0
0x302f3428419d 13d 498b12 REX.W movq rdx,[r10]
0x302f342841a0 140 488d7c175f REX.W leaq rdi,[rdi+rdx*1+0x5f]
0x302f342841a5 145 ffe7 jmp rdi
RelocInfo (size = -1377915637)
(lldb) expr Code::cast(isolate->heap()->code_stubs()->ValueAt(451))->entry()
(byte *) $427 = 0x0000302f34284060
These are not the same stubs, the opcodes match but not the entry addresses
IGNITION_HANDLER(StackCheck, InterpreterAssembler) {
Label ok(this), stack_check_interrupt(this, Label::kDeferred);
Node* interrupt = StackCheckTriggeredInterrupt();
Branch(interrupt, &stack_check_interrupt, &ok);
BIND(&ok);
Dispatch();
BIND(&stack_check_interrupt);
{
Node* context = GetContext();
CallRuntime(Runtime::kStackGuard, context);
Dispatch();
}
}
void MacroAssembler::CallRuntime(const Runtime::Function* f,
int num_arguments,
SaveFPRegsMode save_doubles) {
// If the expected number of arguments of the runtime function is
// constant, we check that the actual number of arguments match the
// expectation.
CHECK(f->nargs < 0 || f->nargs == num_arguments);
// TODO(1236192): Most runtime routines don't need the number of
// arguments passed in because it is constant. At some point we
// should remove this need and make the runtime routine entry code
// smarter.
Set(rax, num_arguments);
LoadAddress(rbx, ExternalReference(f, isolate()));
CEntryStub ces(isolate(), f->result_size, save_doubles);
CallStub(&ces);
}
But, normally these will be called at compile time.
My thinking is that these are all callstubs (builtins/runtime), that are generated at compile time and then made availalbe
in memory somewhere allowing V8 to jump to the address of the first instruction and start executing it.
But how are these loaded into memory?
This is done as Isolate initialization. Lets set the following break point we can follow this process:
(lldb) br s -n Isolate::Init
(lldb) br s -f isolate.cc -l 2703
bool Isolate::Init(StartupDeserializer* des) {
...
isolate_addresses_[IsolateAddressId::kHandlerAddress] = reinterpret_cast<Address>(handler_address());
isolate_addresses_[IsolateAddressId::kCEntryFPrAddress] = reinterpret_cast<Address>(centry_fp_address());
isolate_addresses_[IsolateAddressId::kFunctionAddress] = reinterpret_cast<Address>(function_address());
isolate_addresses_[IsolateAddressId::kContextAddress] = reinterpret_cast<Address>(context_address());
isolate_addresses_[IsolateAddressId::kPendingExceptionAddress] = reinterpret_cast<Address>(pending_exception_address());
isolate_addresses_[IsolateAddressId::kPendingHandlerContextAddress] = reinterpret_cast<Address>(pending_handler_context_address());
isolate_addresses_[IsolateAddressId::kPendingHandlerCodeAddress] = reinterpret_cast<Address>(pending_handler_code_address());
isolate_addresses_[IsolateAddressId::kPendingHandlerOffsetAddress] = reinterpret_cast<Address>(pending_handler_offset_address());
isolate_addresses_[IsolateAddressId::kPendingHandlerFPAddress] = reinterpret_cast<Address>(pending_handler_fp_address());
isolate_addresses_[IsolateAddressId::kPendingHandlerSPAddress] = reinterpret_cast<Address>(pending_handler_sp_address());
isolate_addresses_[IsolateAddressId::kExternalCaughtExceptionAddress] = reinterpret_cast<Address>(external_caught_exception_address());
isolate_addresses_[IsolateAddressId::kJSEntrySPAddress] = reinterpret_cast<Address>(js_entry_sp_address());
...
InitializeThreadLocal();
...
if (!create_heap_objects) des->DeserializeInto(this);
The functions, like handler_address(), can be found in v8/src/isolate.h:
inline Address* handler_address() { return &thread_local_top_.handler_; }
Notice that when the isolate_addresses_ array is populated and when InitializeThreadLocal is call the heap is still empty. What is
happening is that pointers are being setup. When DeserializeInto(this)
is called is when the heap is populated:
void StartupDeserializer::DeserializeInto(Isolate* isolate) {
Initialize(isolate);
BuiltinDeserializer builtin_deserializer(isolate, builtin_data_);
...
builtin_deserializer.DeserializeEagerBuiltins();
...
CodeStub::GenerateFPStubs(this);
StoreBufferOverflowStub::GenerateFixedRegStubsAheadOfTime(this);
}
DeserializeEagerBuiltins will populate the builtins_ array which is part of Builtins which is a member of Isolate. There seems to be codestubs that have to be generated, they cannot be serialized into the snapshot.
CodeStub::GenerateFPStubs(this)
, it is here that CEntryStub is generated:
void CodeStub::GenerateFPStubs(Isolate* isolate) {
// Generate if not already in cache.
SaveFPRegsMode mode = kSaveFPRegs;
CEntryStub(isolate, 1, mode).GetCode();
StoreBufferOverflowStub(isolate, mode).GetCode();
}
We can check that check the various data structures before and after using:
(lldb) expr builtins_
(lldb) expr heap->roots_
(lldb) expr isolate_addresses_
(lldb) expr thread_local_top_
For isolate_addresses_
which are pointers we can inspect them like this:
(lldb) expr isolate_addresses_[11]
(v8::internal::Address) $194 = 0x0000000104808820 ""
(lldb) memory read -f x -c 1 -s 8 0x0000000104808820
0x104808820: 0x0000000000000000
(lldb) expr CodeStub::Major::CEntry
(int) $120 = 4
Local<Value> result = script.ToLocalChecked()->Run();
If we back down through the call frame again, we will be in execution.cc
and see this now familar code:
typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
Object* receiver, int argc,
Object*** args);
Handle<Code> code = is_construct
? isolate->factory()->js_construct_entry_code()
: isolate->factory()->js_entry_code();
...
// start of the function identified by code->entry() address.
JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
// Call the function through the right JS entry stub.
Object* orig_func = *new_target;
Object* func = *target;
Object* recv = *receiver;
Object*** argv = reinterpret_cast<Object***>(args);
if (FLAG_profile_deserialization && target->IsJSFunction()) {
PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
}
RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
}
Notice that code
will be what isolate->factory()->js_entry_code()
returns:
(lldb) p isolate->factory()->js_entry_code()
You can verify that it is in fact bootstrap.js that func
represents by issueing:
(lldb) job func
Handle<Code> code represents generated machine code, and we can see that this instance is on Kind
STUB`:
(lldb) p code->kind()
(v8::internal::Code::Kind) $32 = STUB
Next, we have CALL_GENERATED_CODE
which can be found in src/x64/simulator-x64.h
:
// Since there is no simulator for the x64 architecture the only thing we can
// do is to call the entry directly.
// TODO(X64): Don't pass p0, since it isn't used?
#efine CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
(entry(p0, p1, p2, p3, p4))
So this will be an call that looks like this:
entry(orig_func, func, recv, argc, argv);
0x100e381b4 <+1348>: movq -0x118(%rbp), %rdx // move the value of address of code->entry() into rdx
0x100e381bb <+1355>: movq -0x120(%rbp), %rdi // first argument which is orig_func
-> 0x100e381c2 <+1362>: movq -0x128(%rbp), %rsi // second argument which is func
0x100e381c9 <+1369>: movq -0x130(%rbp), %rcx // third argument which is recv
0x100e381d0 <+1376>: movl -0x30(%rbp), %eax // fourth arg which is argc
0x100e381d3 <+1379>: movq -0x138(%rbp), %r8 // fifth argument which is argv
0x100e381da <+1386>: movq %rdx, -0x1c0(%rbp) // move code->entry from rdx into local varialbe
0x100e381e1 <+1393>: movq %rcx, %rdx // move recv into rdx
0x100e381e4 <+1396>: movl %eax, %ecx // move argc into ecx
0x100e381e6 <+1398>: movq -0x1c0(%rbp), %r9 // move code-entry into register r9
0x100e381ed <+1405>: callq *%r9 // call address in register 9
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x118`
0x7fff5fbfd828: 0x0000302f34084060
(lldb) expr code->entry()
(byte *) $186 = 0x0000302f34084060
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x120`
0x7fff5fbfd820: 0x00001d64e5d822e1
(lldb) expr orig_func
(v8::internal::Object *) $209 = 0x00001d64e5d822e1
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x128`
0x7fff5fbfd818: 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $211 = 0x00001d6417d30669
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x130`
0x7fff5fbfd810: 0x00001d64f0a82239
(lldb) expr recv
(v8::internal::Object *) $213 = 0x00001d64f0a82239
(lldb) memory read -f x -c 1 -s 4 `$rbp - 0x30`
0x7fff5fbfd910: 0x00000000
(lldb) expr argc
(int) $216 = 0
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x138`
0x7fff5fbfd808: 0x0000000000000000
(lldb) expr argv
(v8::internal::Object ***) $219 = 0x0000000000000000
(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x1c0`
0x7fff5fbfd780: 0x0000302f34084060
The last instruction callq *%r9
will call into JSEntryStub:
-> 0x302f34084060: pushq %rbp // push callers base frame pointer saving it so we can restore it
0x302f34084061: movq %rsp, %rbp // mov the current value of rsp to into rbp which will be the frame pointer for this function
0x302f34084064: pushq $0x2 // this is pushing an immediate value 2 onto the stack. Where does this come from, the following:
(lldb) p v8::internal::StackFrame::TypeToMarker(static_cast<v8::internal::StackFrame::Type>(v8::internal::StackFrame::Type::ENTRY))
(int32_t) $262 = 2
0x302f34084066: movabsq $0x106001990, %r10 // move the context address into r10
(lldb) expr isolate->isolate_addresses_[IsolateAddressId::kContextAddress]
(v8::internal::Address) $230 = 0x0000000106001990
0x302f34084070: movq (%r10), %r10 // the context address is a pointer, this will dereference it
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
(lldb) register read r10
r10 = 0x00001d6417d03a59
0x302f34084073: pushq %r10 // push the dereferences context onto the stack
0x302f34084075: pushq %r12 // r12 must be preserved accross function calls so save it and pop it later before returning
0x302f34084077: pushq %r13 // r13 must be preserved accross function calls so save it and pop it later before returning
0x302f34084079: pushq %r14 // r14 must be preserved accross function calls so save it and pop it later before returning
0x302f3408407b: pushq %r15 // r14 must be preserved accross function calls so save it and pop it later before returning
0x302f3408407d: pushq %rbx // rbx must be preserved accross function calls so save it and pop it later before returning
0x302f3408407e: movabsq $0x106000048, %r13 // move the value of the roots_array_start into r13
(lldb) expr isolate->heap()->roots_array_start()
(v8::internal::Object **) $233 = 0x0000000106000048
0x302f34084088: addq $0x80, %r13 // addp(kRootRegister, Immediate(kRootRegisterBias)); kRootRegisterBias is 128
0x302f3408408f: movabsq $0x106001a00, %r10 // move he CEntryFPAddress into r10
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kCEntryFPAddress]
0x106001a00: 0x0000000000000000
0x302f34084099: pushq (%r10) // dereference CEntryAddress and push onto the stack
(lldb) memory read -f x -c 1 -s 8 0x0000000106001a00
0x106001a00: 0x0000000000000000
0x302f3408409c: movabsq 0x106001a20, %rax // move JSEntrySPAddress into rax
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
0x106001a20: 0x0000000000000000
0x302f340840a6: testq %rax, %rax // is rax zero? if so there is an outer js call
(lldb) register read rax
rax = 0x0000000000000000
0x302f340840af: pushq $0x2 // v8::internal::StackFrame::OUTERMOST_JSENTRY_FRAME))
0x302f340840b1: movq %rbp, %rax // move this functions base pointer into rax
0x302f340840b4: movabsq %rax, 0x106001a20 // store the base pointer in isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x0000000000000000
after
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x00007fff5fbfd760
(lldb) register read rbp
rbp = 0x00007fff5fbfd940
0x302f340840be: jmp 0x302f340840c5
0x302f340840c5: jmp 0x302f340840e0
0x302f340840e0: movabsq $0x106001a08, %r10 // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kHandlerAddress]`
0x106001a08: 0x0000000000000000
0x302f340840ea: pushq (%r10) // push the dereferenced handler address
0x302f340840ed: movabsq $0x106001a08, %r10 // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
0x302f340840f7: movq %rsp, (%r10) // move the value of the current stack pointer into the object pointed to be r10
(lldb) memory read -f x -c 1 -s 8 0x106001a08
0x106001a08: 0x00007fff5fbfd710
(lldb) register read rsp
rsp = 0x00007fff5fbfd710
0x302f340840fa: callq 0x302f341418c0 // call JSEntryTrampoline builtin
(lldb) expr isolate->builtins()->builtin_handle(Builtins::Name::kJSEntryTrampoline)->entry()
(byte *) $255 = 0x0000302f341418c0
In deps/v8/src/builtins/x64/builtins-x64.cc
we can find Generate_JSEntryTrampolineHelper
which is what generates the builtin. As this is done
a compile time we can put a break point in it (you can debug mksnapshot though which is done elsewhere in this document).
0x302f341418c0: movq %rdi, %r11 // move the orig_fun into r11
0x302f341418c3: movq %rsi, %rdi // move func into rdi
0x302f341418c6: xorl %esi, %esi // zero out esi
0x302f341418c8: pushq %rbp // EnterFrame (deps/v8/src/x64/macro-assembler-x64.cc)
0x302f341418c9: movq %rsp, %rbp //
0x302f341418cc: pushq $0x1c // push 0x1c (decimal 28)
(lldb) p v8::internal::StackFrame::TypeToMarker(static_cast<v8::internal::StackFrame::Type>(v8::internal::StackFrame::Type::INTERNAL))
(int32_t) $256 = 28
0x302f341418ce: movabsq $0x302f34141861, %r10 // CodeObject() what is this, seems like it is Handle<HeapObject> which will be patched later
// I think this might be the register file?
(lldb) memory read -f x -c 1 -s 8 0x302f34141861
0x302f34141861: 0x6900001d64ceb827
0x302f341418d8: pushq %r10 // push the HandleHeapObject into the stack
0x302f341418da: movabsq $0x1d64e5d822e1, %r10 // move undefined value into r10
(lldb) expr *isolate->factory()->undefined_value()
(v8::internal::Oddball *) $264 = 0x00001d64e5d822e1
0x302f341418e4: cmpq %r10, (%rsp)
0x302f341418e8: jne 0x302f341418fa // last of EnterFrame if we don't abort that is
0x302f341418fa: movabsq $0x106001990, %r10 // move the ContextAdress into r10 (the scratch register for x86)
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
0x302f34141904: movq (%r10), %rsi // deref and move into context into rsi
0x302f34141907: pushq %rdi // push function onto the stack
0x302f34141908: pushq %rdx // push recv onto the stack (this was moved above with 0x302f34141908: pushq %rdx)
0x302f34141909: movq %rcx, %rax // move argc into rax (this was moved above with 0x100e381e4 <+1396>: movl %eax, %ecx)
0x302f3414190c: movq %r8, %rbx // move argv into rbx
0x302f3414190f: movq %r11, %rdx // move orig_func into rdx
// Generate_CheckStackOverflow TODO: got through this as I struggled to understand/map the generated instructions to the source code :(
0x302f34141912: movq 0xd08(%r13), %r10 //
0x302f34141919: movq %rsp, %rcx
0x302f3414191c: subq %r10, %rcx
0x302f3414191f: movq %rax, %r11
0x302f34141922: shlq $0x3, %r11
0x302f34141926: cmpq %r11, %rcx
0x302f34141929: jg 0x302f34141940
// Generate_JSEntryTrampolineHelper
0x302f34141940: xorl %ecx, %ecx // zero out ecx, for the following loop
0x302f34141942: jmp 0x302f3414194f // jump to entry label
0x302f3414194f: cmpq %rax, %rcx // compare argc with rcx (this was moved above).
0x302f34141952: jne 0x302f34141944 // entry the loop if. This is pushing the argv values onto the stack in a loop, but we don't have any so we fall through
0x302f34141954: callq 0x302f3413c400 //
(lldb) expr isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $279 = 0x0000302f3413c400 "@\xfffffff6\xffffffc7\x01\x0f\xffffff84F"
// for the full object with assembly language instructions:
(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// Builtins::Generate_Call_ReceiverIsAny 'deps/v8/src/builtins/builtins-call-gen.cc':
// void Builtins::Generate_Call_ReceiverIsAny(MacroAssembler* masm) {
// Generate_Call(masm, ConvertReceiverMode::kAny);
// }
// Builtins::Generate_Call `deps/v8/src/builtins/x64/builtins-x64.cc`
0x302f3413c400: testb $0x1, %dil // move the immediate 1 into the low 8 bits or rdi // __ JumpIfSmi(rdi, &non_callable); which consists of CheckSmi
0x302f3413c404: je 0x302f3413c450 // if ZF = 1 then jump. This would happen if rdi was a SMI// __ JumpIfSmi(rdi, &non_callable: which after CheckSmi will jump. rdi is the target and not a smi in our case
// recall that rdi is the function:
(lldb) register read rdi
rdi = 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $280 = 0x00001d6417d30669
// Now I think that func is/was of type JSFunction (deps/v8/src/objects.h) as it was cast to Object* by:
// auto fun = i::Handle<i::JSFunction>::cast(Utils::OpenHandle(this));
// class JSFunction: public JSObject {
public:
// [prototype_or_initial_map]:
DECL_ACCESSORS(prototype_or_initial_map, Object)
0x302f3413c40a: movq -0x1(%rdi), %rcx // move the map into rcx. CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx). HeapObject::kMapOffset which is the first field in JSFunction:
(lldb) memory read -f x -c 1 -s 8 `$rdi - 1`
0x1d6417d30668: 0x00001d6433c82521
(lldb) expr JSFunction::cast(func)->prototype_or_initial_map()
(v8::internal::Object *) $286 = 0x00001d64e5d82321
0x302f3413c40e: cmpb $-0x1, 0xb(%rcx) // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
0x302f3413c412: je 0x302f3413be20 // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
// I was not sure where to find this `j` function but it is in src/x64/assembler-x64.cc (Assembler::j but note
// that there are multiple overloaded j functions so make sure you are looking at the correct one. There is
// as section that discusses Assembler::j in detail later in this document.
// so what are we jumping to? We can back up in the debugger and find out:
// (lldb) up 2
// (lldb job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// This will be a builtin named CallFunction_ReceiverIsAny, and being a builtin you'll find it in
// `src/builtins/builtins-definitions.h`. The implementation will be in `src/builtins/builtins-call.cc` and
// will result in `return builtin_handle(kCallFunction_ReceiverIsAny)`. This code repsonsible for generating
// is Builtins::Generate_CallFunction and in our case that means `src/builtins/x64/builtins-x64.cc`
(lldb) expr isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $317 = 0x0000302f3413be20
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc)
0x302f3413be20: testb $0x1, %dil // AssertFunction (deps/v8/src/x64/macro-assembler-x64.cc) check if rdi is of type smi (testb(object, Immediate(kSmiTagMask));)
// rdi is the function to call:
(lldb) register read rdi
rdi = 0x00001d6417d30669
(lldb) expr JSFunction::cast(func)
(v8::internal::JSFunction *) $320 = 0x00001d6417d30669
0x302f3413be24: jne 0x302f3413be36 // this is also generated by AssertFunciton and the call to Assembler::j If not equal this code will fall through
0x302f3413be36: pushq %rdi // AssertFunction still, push rdi (the function) onto the stack
0x302f3413be37: movq -0x1(%rdi), %rdi // move the map into rdi. CmpObjectType(object, JS_FUNCTION_TYPE, object). HeapObject::kMapOffset which is the first field in JSFunction
0x302f3413be3b: cmpb $-0x1, 0xb(%rdi) // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
0x302f3413be3f: popq %rdi // pop function from stack into rdi again
0x302f3413be40: je 0x302f3413be52 // jump if equal will jump to a L label and return from the Check call and then return from AssertFunction
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc)
0x302f3413be52: movq 0x1f(%rdi), %rdx // move the SharedFunctionInfo into rdx:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x1f`
0x1d6417d30688: 0x00001d6417d2dc21
(lldb) expr JSFunction::cast(func)->shared()
(v8::internal::SharedFunctionInfo *) $325 = 0x00001d6417d2dc21
0x302f3413be56: testb $-0x20, 0x87(%rdx) // testl(FieldOperand(rdx, SharedFunctionInfo::kCompilerHintsOffset)
(lldb) expr JSFunction::cast(func)->shared()->compiler_hints()
(int) $332 = 1056770
(lldb) memory read -f dec -c 1 -s 4 `($rdx + 0x87)`
0x1d6417d2dca8: 1056770
0x302f3413be5d: jne 0x302f3413bfbc // __ j(not_zero, &class_constructor);
0x302f3413be63: movq 0x27(%rdi), %rsi // move the functions context info rsi:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x27`
0x1d6417d30690: 0x00001d6417d03a59
(lldb) expr JSFunction::cast(func)->context()
(v8::internal::Context *) $345 = 0x00001d6417d03a59
0x302f3413be67: testb $0x3, 0x87(%rdx) // check SharedFunctionInfo::IsNativeBit::kMask | SharedFunctionInfo::IsStrictBit::kMask
0x302f3413be6e: jne 0x302f3413bf14 //
0x302f3413bf14: movslq 0x73(%rdx), %rbx // __ movsxlq(rbx, FieldOperand(rdx, SharedFunctionInfo::kFormalParameterCountOffset))
0x302f3413bf18: movabsq $0x105300b92, %r10 // move hook_on_function_call_address into scratch register; part of CheckDebugHook
(lldb) expr isolate->debug()->hook_on_function_call_address()
(v8::internal::Address) $353 = 0x0000000105300b92
0x302f3413bf22: cmpb $0x0, (%r10) // how is this generate? In CheckDebugHook I can only find cmpb(debug_hook_active_operand, Immediate(0)); but not he previous moveabsq
0x302f3413bf26: je 0x302f3413bfa4 // will jump to the label at the end of CheckDebugHook
// MacroAssembler::InvokeFunction
0x302f3413bfa4: movq -0x60(%r13), %rdx // move the UndefinedValueRootIndex into rdx generated by LoadRoot(rdx, Heap::kUndefinedValueRootIndex);
(lldb) memory read -f x -c 1 -s 8 `$r13 - 0x60`
0x106000068: 0x00001d64e5d822e1
lldb) expr isolate->heap()->roots_[Heap::RootListIndex::kUndefinedValueRootIndex]
(v8::internal::Object *) $354 = 0x00001d64e5d822e1
// InvokePrologue(expected, actual, &done, &definitely_mismatches, flag, Label::kNear)
0x302f3413bfa8: cmpq %rax, %rbx // Set(rax, actual.immediate());
0x302f3413bfab: je 0x302f3413bfb2 // will return from InvokePrologue
0x302f3413bfb2: movq 0x37(%rdi), %rcx // move the function code into rcx (movp(rcx, FieldOperand(function, JSFunction::kCodeOffset)))
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x37`
0x1d6417d306a0: 0x0000302f34144281
(lldb) expr JSFunction::cast(func)->code()
(v8::internal::Code *) $358 = 0x0000302f34144281
0x302f3413bfb6: addq $0x5f, %rcx // addp(rcx, Immediate(Code::kHeaderSize - kHeapObjectTag)); the instruction start follows the Code object header
(lldb) job JSFunction::cast(func)->code()
0x302f34144281: [Code]
yind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x302f341442e0 0 488b5f2f REX.W movq rbx,[rdi+0x2f]
(lldb) memory read -f x -c 1 -s8 `$rcx + 0x5f`
0x302f341442e0: 0x075b8b482f5f8b48
// notice that rcx now point to the first instruction.
0x302f3413bfba: jmpq *%rcx // now jump to the first instruction of function :)
0x302f341442e0: movq 0x2f(%rdi), %rbx // move the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x2f`
0x1d6417d30698: 0x00001d6417d306e9
(lldb) expr JSFunction::cast(func)->feedback_vector()
(v8::internal::FeedbackVector *) $363 = 0x00001d6417d306a9
(lldb) job JSFunction::cast(func)->feedback_vector()
0x1d6417d306a9: [FeedbackVector] in OldSpace
- length: 1
SharedFunctionInfo: 0x1d6417d2dc21 <SharedFunctionInfo>
Optimized Code: 0
Invocation Count: 0
Profiler Ticks: 0
Slot #0 kCreateClosure
[0]: 0x1d6417d306d9 <Cell value= 0x1d64e5d822e1 <undefined>>
302f341442e4: movq 0x7(%rbx), %rbx // move Slot[0] from the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0x7`
0x1d6417d306f0: 0x00001d6417d306a9
// MaybeTailCallOptimizedCodeSlot(masm, feedback_vector, rcx, r14, r15);
// MaybeTailCallOptimizedCodeSlot(MacroAssembler* masm, Register feedback_vector, Register scratch1, Register scratch2, Register scratch3)
// Register closure = rdi;
// Register optimized_code_entry = rcx;
0x302f341442e8: movq 0xf(%rbx), %rcx // __ movp(optimized_code_entry, FieldOperand(feedback_vector, FeedbackVector::kOptimizedCodeOffset));
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0xf`
0x1d6417d306b8: 0x0000000000000000
(lldb) expr JSFunction::cast(func)->feedback_vector()->optimized_code()
(v8::internal::Code *) $367 = 0x0000000000000000
0x302f341442ec: testb $0x1, %cl // smi test against low 8 bit of rcx called from JumpIfNotSmi -> CheckSmi
0x302f341442ef: jne 0x302f34144486 // JumpIfNotSmi -> Assembler::j
0x302f341442f5: testb $0x1, %cl // test again but this time from MacroAssembler::SmiCompare and its call to AssertSmi which calls CheckSmi
0x302f341442f8: je 0x302f3414430a // SmiCompare -> AssertSmi -> Check
RelocInf (size = 10) 0x302f341418d0 embedded object (0x302f34141861 <Code BUILTIN>)
0x302f341418dc embedded object (0x1d64e5d822e1 <undefined>)
0x302f341418f5 code target (BUILTIN) (0x302f341659c0)
0x302f341418fc external reference (Isolate::context_address) (0x106001990)
0x302f34141933 external reference (Runtime::ThrowStackOverflow) (0x101438e80)
0x302f3414193c code target (STUB) (0x302f34084740)
0x302f34141955 code target (BUILTIN) (0x302f3413c400)
0x302f3414196b code target (BUILTIN) (0x302f341659c0)
SharedFunctionInfo::kCompilerHintsOffset
SHARED_FUNCTION_INFO_FIELDS)
JSEntryStub
Process 568 stopped
* thread #1: tid = 0xf3b301, 0x000020819e104060, queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x000020819e104060
-> 0x20819e104060: pushq %rbp
0x20819e104061: movq %rsp, %rbp
0x20819e104064: pushq $0x2
0x20819e104066: movabsq $0x104803190, %r10 ; imm = 0x104803190
I've previously mapped the assembly with the void JSEntryStub::Generate
function but I did not cover:
0x20819e1040fa 9a e8c1d70b00 call 0x20819e1c18c0 (JSEntryTrampoline) ;; code: BUILTIN
This matches the assembly generated in Generate_JSEntryTrampolineHelper(MacroAssembler* masm, bool is_construct)
in src/builtins/x64/builtins-x64.cc
:
frame #0: 0x000020819e1c18c0
-> 0x20819e1c18c0: movq %rdi, %r11 // __ movp(r11, rdi);
0x20819e1c18c3: movq %rsi, %rdi // __ movp(rdi, rsi);
0x20819e1c18c6: xorl %esi, %esi // __ Set(rsi, 0);
0x20819e1c18c8: pushq %rbp // FrameScope scope(masm, StackFrame::INTERNAL);
0x20819e1c18c9: movq %rsp, %rbp // FrameScope scope(masm, StackFrame::INTERNAL);
0x20819e1c18cc: pushq $0x1c // Is this also generated by the above scope?
0x20819e1c18ce: movabsq $0x20819e1c1861, %r10 // Is this also generated by the above scope?
...
-> 0x20819e1c1904: movq (%r10), %rsi //
0x20819e1c1907: pushq %rdi // __ Push(rdi); func onto the stack
0x20819e1c1908: pushq %rdx // __ Push(rdx); recv onto the stack
0x20819e1c1909: movq %rcx, %rax // __ movp(rax, rcx); argc
0x20819e1c190c: movq %r8, %rbx // __ movp(rbx, r8); pointer to args
0x20819e1c190f: movq %r11, %rdx // __ movp(rdx, r11); new target into rdx
0x20819e1c1912: movq 0xd08(%r13), %r10 // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
0x20819e1c1919: movq %rsp, %rcx // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
0x20819e1c191c: subq %r10, %rcx // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
0x20819e1c191f: movq %rax, %r11 // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
0x20819e1c1922: shlq $0x3, %r11 // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
0x20819e1c1926: cmpq %r11, %rcx // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
0x20819e1c1929: jg 0x20819e1c1940 // jump if alright
0x20819e1c192f: xorl %eax, %eax // handle stack overflow
0x20819e1c1931: movabsq $0x1014259b0, %rbx // handle stack overflow
0x20819e1c193b: callq 0x20819e104740 // handle stack overflow
-> 0x20819e1c1940: xorl %ecx, %ecx // __ Set(rcx, 0); // Set loop variable to 0.
0x20819e1c1942: jmp 0x20819e1c194f // __ jmp(&entry, Label::kNear);
-> 0x20819e1c194f: cmpq %rax, %rcx // both are zero at this stage
-> 0x20819e1c1952: jne 0x20819e1c1944
// Handle<Code> builtin = is_construct
// ? BUILTIN_CODE(masm->isolate(), Construct)
// : masm->isolate()->builtins()->Call();
0x20819e1c1954: callq 0x20819e1bc400 //__ Call(builtin, RelocInfo::CODE_TARGET);
Lets take a look Call
in src/x64/macro-assembler-x64.h
:
void Call(Handle<Code> code_object, RelocInfo::Mode rmode);
And the implementation look like this:
void TurboAssembler::Call(Handle<Code> code_object, RelocInfo::Mode rmode) {
#ifdef DEBUG
int end_position = pc_offset() + CallSize(code_object);
#endif
DCHECK(RelocInfo::IsCodeTarget(rmode));
call(code_object, rmode);
#ifdef DEBUG
DCHECK_EQ(end_position, pc_offset());
#endif
}
I think that call
will end up in src/x64/assembler-x64.cc
:
void Assembler::call(Handle<Code> target, RelocInfo::Mode rmode) {
EnsureSpace ensure_space(this);
// 1110 1000 #32-bit disp.
emit(0xE8);
emit_code_target(target, rmode);
}
src/x64/assembler-x64-inl.h
emit(0xE8)
RUNTIME_FUNCTION
RUNTIME_FUNCTION(Runtime_InterpreterNewClosure) {
HandleScope scope(isolate);
DCHECK_EQ(4, args.length());
CONVERT_ARG_HANDLE_CHECKED(SharedFunctionInfo, shared, 0);
CONVERT_ARG_HANDLE_CHECKED(FeedbackVector, vector, 1);
CONVERT_SMI_ARG_CHECKED(index, 2);
CONVERT_SMI_ARG_CHECKED(pretenured_flag, 3);
Handle<Context> context(isolate->context(), isolate);
FeedbackSlot slot = FeedbackVector::ToSlot(index);
Handle<Cell> vector_cell(Cell::cast(vector->Get(slot)), isolate);
return *isolate->factory()->NewFunctionFromSharedFunctionInfo(
shared, context, vector_cell,
static_cast<PretenureFlag>(pretenured_flag));
}
deps/v8/src/arguments.h
we find the RUNTIME_FUNCTION
macro:
#define RUNTIME_FUNCTION_RETURNS_TYPE(Type, Name) \
static INLINE(Type __RT_impl_##Name(Arguments args, Isolate* isolate)); \
\
V8_NOINLINE static Type Stats_##Name(int args_length, Object** args_object, \
Isolate* isolate) { \
RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::Name); \
TRACE_EVENT0(TRACE_DISABLED_BY_DEFAULT("v8.runtime"), \
"V8.Runtime_" #Name); \
Arguments args(args_length, args_object); \
return __RT_impl_##Name(args, isolate); \
} \
\
Type Name(int args_length, Object** args_object, Isolate* isolate) { \
DCHECK(isolate->context() == nullptr || isolate->context()->IsContext()); \
CLOBBER_DOUBLE_REGISTERS(); \
if (V8_UNLIKELY(FLAG_runtime_stats)) { \
return Stats_##Name(args_length, args_object, isolate); \
} \
Arguments args(args_length, args_object); \
return __RT_impl_##Name(args, isolate); \
} \
\
static Type __RT_impl_##Name(Arguments args, Isolate* isolate)
#define RUNTIME_FUNCTION(Name) RUNTIME_FUNCTION_RETURNS_TYPE(Object*, Name)
So lets see what that expands to:
static INLINE(Type __RT_impl_Runtime_InterpreterNewClosure(Arguments args, Isolate* isolate));
V8_NOINLINE static Object* Stats_InterpreterNewClosure(int args_length, Object** args_object, Isolate* isolate) {
RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::Name);
TRACE_EVENT0(TRACE_DISABLED_BY_DEFAULT("v8.runtime"), "V8.Runtime_" InterpreterNewClosure);
Arguments args(args_length, args_object);
return __RT_impl_InterpreterNewClosure(args, isolate);
}
Object* Runtime_InterpreterNewClosure(int args_length, Object** args_object, Isolate* isolate) {
DCHECK(isolate->context() == nullptr || isolate->context()->IsContext());
CLOBBER_DOUBLE_REGISTERS();
if (V8_UNLIKELY(FLAG_runtime_stats)) {
return Stats_Runtime_InterpreterNewClosure(args_length, args_object, isolate);
}
Arguments args(args_length, args_object);
return __RT_impl_Runtime_InterpreterNewClosure(args, isolate);
}
static Object* __RT_impl_Runtime_InterpreterNewClosure(Arguments args, Isolate* isolate)
HandleScope scope(isolate);
DCHECK_EQ(4, args.length());
CONVERT_ARG_HANDLE_CHECKED(SharedFunctionInfo, shared, 0);
CONVERT_ARG_HANDLE_CHECKED(FeedbackVector, vector, 1);
CONVERT_SMI_ARG_CHECKED(index, 2);
CONVERT_SMI_ARG_CHECKED(pretenured_flag, 3);
Handle<Context> context(isolate->context(), isolate);
FeedbackSlot slot = FeedbackVector::ToSlot(index);
Handle<Cell> vector_cell(Cell::cast(vector->Get(slot)), isolate);
return *isolate->factory()->NewFunctionFromSharedFunctionInfo(
shared, context, vector_cell,
static_cast<PretenureFlag>(pretenured_flag));
}
Notice that Runtime_InterpreterNewClosure
is called.
NewFunctionFromSharedFunctionInfo will call NewFunction which will create a new function:
function->initialize_properties();
function->initialize_elements();
function->set_shared(*info);
function->set_code(info->code());
function->set_context(*context_or_undefined);
function->set_prototype_or_initial_map(*the_hole_value());
function->set_feedback_vector_cell(*undefined_cell());
isolate()->heap()->InitializeJSObjectBody(*function, *map, JSFunction::kSize);
return function;
Notice the call function->set_code(info->code())
. This is the assembly code for InterpreterEntryTrampoline
.
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv,
147 argc, argv);
So we are back here. Value is of type Function and is the function for node_bootstrap.js. I'm confused as I thought that the above call
to CALL_GENEREATED_CODE
would run the script. The actuall calling of the functions is done in :
auto ret = f->Call(env->context(), Null(env->isolate()), 1, &arg);
(lldb) dis -f node`v8::internal::Runtime_InterpreterNewClosure:
Handle<JSFunction> Factory::NewFunctionFromSharedFunctionInfo(
Handle<Map> initial_map, Handle<SharedFunctionInfo> info,
Handle<Object> context_or_undefined, Handle<Cell> vector,
PretenureFlag pretenure) {
Handle<JSFunction> result =
NewFunction(initial_map, info, context_or_undefined, pretenure);
(lldb) job *info
0x2e6ef602e159: [SharedFunctionInfo] in OldSpace
- name = 0x2e6ea0e02441 <String[0]: >
- kind = [ NormalFunction ]
- function_map_index = 129
- formal_parameter_count = 1
- expected_nof_properties = 10
- language_mode = strict
- instance class name = #Object
- code = 0x20819e1c4281 <Code BUILTIN>
- bytecode_array = 0x2e6ef6030501
- source code = (process) {
let internalBinding;
const exceptionHandlerState = { captureFn: null };
function startup() {
const EventEmitter = NativeModule.require('events');
const origProcProto = Object.getPrototypeOf(process);
Object.setPrototypeOf(origProcProto, EventEmitter.prototype);
EventEmitter.call(process);
setupProcessObject();
...
startup();
}
- anonymous expression
- function token position = 300
- start position = 308
- end position = 21943
- no debug info
- length = 1
- feedback_metadata = 0x2e6ef6030759: [FeedbackMetadata] in OldSpace
- length: 15
- slot_count: 83
So we can see this is indeed bootstrap.js
Handle<JSFunction> Factory::NewFunction(Handle<Map> map,
Handle<SharedFunctionInfo> info,
Handle<Object> context_or_undefined,
PretenureFlag pretenure) {
AllocationSpace space = pretenure == TENURED ? OLD_SPACE : NEW_SPACE;
Handle<JSFunction> function = New<JSFunction>(map, space);
DCHECK(context_or_undefined->IsContext() ||
context_or_undefined->IsUndefined(isolate()));
function->initialize_properties();
function->initialize_elements();
function->set_shared(*info);
function->set_code(info->code());
function->set_context(*context_or_undefined);
function->set_prototype_or_initial_map(*the_hole_value());
function->set_feedback_vector_cell(*undefined_cell());
isolate()->heap()->InitializeJSObjectBody(*function, *map, JSFunction::kSize);
return function;
}
Lets now take a look at info->code()
(lldb) job info->code()
0x20819e1c4281: [Code]
kind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1133)
0x20819e1c42e0 0 488b5f2f REX.W movq rbx,[rdi+0x2f]
0x20819e1c42e4 4 488b5b07 REX.W movq rbx,[rbx+0x7]
0x20819e1c42e8 8 488b4b0f REX.W movq rcx,[rbx+0xf]
0x20819e1c42ec c f6c101 testb rcx,0x1
0x20819e1c42ef f 0f8512020000 jnz 0x20819e1c4507 (InterpreterEntryTrampoline)
0x20819e1c42f5 15 f6c101 testb rcx,0x1
0x20819e1c42f8 18 7410 jz 0x20819e1c430a (InterpreterEntryTrampoline)
...
Setting through again and we will be back in:
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv,
argc, argv);
(lldb) job value
0x2e6e9e80ea49: [Function]
- map = 0x2e6e8f082521 [FastProperties]
- prototype = 0x2e6ef60043d1
- elements = 0x2e6ea0e02251 <FixedArray[0]> [HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0x2e6ef602e159 <SharedFunctionInfo>
- name = 0x2e6ea0e02441 <String[0]: >
- formal_parameter_count = 1
- kind = [ NormalFunction ]
- context = 0x2e6ef6003af9 <FixedArray[281]>
- code = 0x20819e1c4281 <Code BUILTIN>
- interpreted
- bytecode = 0x2e6ef6030501
- source code = (process) {
let internalBinding;
...
}
- properties = 0x2e6ea0e02251 <FixedArray[0]> {
#length: 0x2e6ea9c034f9 <AccessorInfo> (const accessor descriptor)
#name: 0x2e6ea9c03569 <AccessorInfo> (const accessor descriptor)
#prototype: 0x2e6ea9c035d9 <AccessorInfo> (const accessor descriptor)
}
- feedback vector: 0x2e6ef60308f1: [FeedbackVector] in OldSpace
- length: 83
SharedFunctionInfo: 0x2e6ef602e159 <SharedFunctionInfo>
Optimized Code: 0
Invocation Count: 0
Profiler Ticks: 0
...
This will then return to node.cc:
Local<Value> result = script.ToLocalChecked()->Run();
if (result.IsEmpty()) {
ReportException(env, try_catch);
exit(4);
}
So the generated code will be entered. When it gets around to processing:
const EventEmitter = NativeModule.require('events');
This will result in another call to Invoke and the contents of func will be lib/event.js.
Code
Is a class in src/objects.h
which represents generated machine code.
DECL_PRINTER(Code)
This macro looks like this:
#ifdef OBJECT_PRINT
#define DECL_PRINTER(Name) void Name##Print(std::ostream& os); // NOLINT
#else
#define DECL_PRINTER(Name)
#endif
So if OBJECT_PRINT is defined there will be a function named:
void CodePrint(std::ostream& os);
Anything that extends v8::internal::Object can also have a Print
function:
(lldb) expr recv->Print()
0x343ceb998159: [JS_API_OBJECT_TYPE]
- map = 0x343c184c7d01 [FastProperties]
- prototype = 0x343cbcce6ee9
- elements = 0x343cad402251 <FixedArray[0]> [HOLEY_ELEMENTS]
- embedder fields: 1
- properties = 0x343ceb99a599 <PropertyArray[3]> {
#close: 0x343c78ce8bd1 <JSFunction JSStreamWrap.handle.close (sfi = 0x343c78ce83f1)> (const data descriptor)
#isClosing: 0x343cbccf8271 <JSFunction isClosing (sfi = 0x343c78ca3a49)> (const data descriptor)
#onreadstart: 0x343cbccf82b1 <JSFunction onreadstart (sfi = 0x343c78ca3af9)> (const data descriptor)
#onreadstop: 0x343cbccf82f1 <JSFunction onreadstop (sfi = 0x343c78ca3ba9)> (const data descriptor)
#onshutdown: 0x343cbccf8331 <JSFunction onshutdown (sfi = 0x343c78ca3c59)> (const data descriptor)
#onwrite: 0x343cbccf8371 <JSFunction onwrite (sfi = 0x343c78ca3d09)> (const data descriptor)
#owner: 0x343ceb998719 <Socket map = 0x343c184c7cb1> (data field 0) properties[0]
#onread: 0x343cbccb7619 <JSFunction onread (sfi = 0x343cb163e541)> (const data descriptor)
#reading: 0x343cad402381 <true> (data field 1) properties[1]
}
- embedder fields = {
0x105d053f0
}
CHECK
These macros Can be found in src/util.h.
#define LIKELY(expr) __builtin_expect(!!(expr), 1)
#define UNLIKELY(expr) __builtin_expect(!!(expr), 0)
#define CHECK(expr) \
do { \
if (UNLIKELY(!(expr))) { \
static const char* const args[] = { __FILE__, STRINGIFY(__LINE__), \
#expr, PRETTY_FUNCTION_NAME }; \
node::Assert(&args); \
} \
} while (0)
#define CHECK_EQ(a, b) CHECK((a) == (b))
So take the following expression:
CHECK_EQ(false, try_catch.IsVerbose());
it would expand to:
if (__builtin_expect(!!(false == try_catch.IsVerbose()), 0)) {
static const char* const args[] = { __FILE__, STRINGIFY(__LINE__), #expr, __PRETTY_FUNCTION_NAME__ };
node::Assert(args);
}
__builtin_expect
expects its parameters to be of type long and not bool, so there is a need to cast.
(lldb) expr !!(false == try_catch.IsVerbose())
(bool) $122 = true
This is already bool but the macro can be used with other types.
OutputStackCheck();
OutputStackCheck
is generated using a macro:
#define DEFINE_BYTECODE_OUTPUT(name, ...) \
template <typename... Operands> \
BytecodeNode BytecodeArrayBuilder::Create##name##Node( \
Operands... operands) { \
return BytecodeNodeBuilder<Bytecode::k##name, __VA_ARGS__>::Make( \
this, operands...); \
} \
\
template <typename... Operands> \
void BytecodeArrayBuilder::Output##name(Operands... operands) { \
BytecodeNode node(Create##name##Node(operands...)); \
Write(&node); \
} \
\
template <typename... Operands> \
void BytecodeArrayBuilder::Output##name(BytecodeLabel* label, \
Operands... operands) { \
DCHECK(Bytecodes::IsJump(Bytecode::k##name)); \
BytecodeNode node(Create##name##Node(operands...)); \
WriteJump(&node, label); \
LeaveBasicBlock(); \
}
BYTECODE_LIST(DEFINE_BYTECODE_OUTPUT)
#undef DEFINE_BYTECODE_OUTPUT
BYTECODE_LIST
is a macro defined in src/interpreter/bytecodes.h
:
// The list of bytecodes which are interpreted by the interpreter.
// Format is V(<bytecode>, <accumulator_use>, <operands>).
#define BYTECODE_LIST(V)
...
V(StackCheck, AccumulatorUse::kNone) \
template <typename... Operands>
BytecodeNode BytecodeArrayBuilder::CreateStackCheckNode(Operands... operands) {
return BytecodeNodeBuilder<Bytecode::kStackCheck, __VA_ARGS__>::Make(this, operands...);
}
template <typename... Operands>
void BytecodeArrayBuilder::OutputStackCheck(Operands... operands) {
BytecodeNode node(CreateStackCheckNode(operands...));
Write(&node);
}
template <typename... Operands>
void BytecodeArrayBuilder::OutputStackCheck(BytecodeLabel* label,
Operands... operands) {
DCHECK(Bytecodes::IsJump(Bytecode::kStackCheck));
BytecodeNode node(CreateStackCheckNode(operands...));
WriteJump(&node, label);
LeaveBasicBlock();
}
Our call does not have any parameters so the first OutputStackCheck
will be called in this case which will create
the BytecodeNode and then call Write(&node)
(lldb) p *node
(v8::internal::interpreter::BytecodeNode) $218 = {
bytecode_ = kStackCheck
operands_ = ([0] = 0, [1] = 0, [2] = 0, [3] = 0, [4] = 0)
operand_count_ = 0
operand_scale_ = kSingle
source_info_ = (position_type_ = kExpression, source_position_ = 0)
}
compiler.cc:803
Handle<SharedFunctionInfo> shared_info =
801 isolate->factory()->NewSharedFunctionInfoForLiteral(parse_info->literal(),
802 parse_info->script());
(lldb) job *shared_info
I'm not showing the whole output but this is the contents of node_bootstrap.js
interpreter.cc:211
Handle<BytecodeArray> bytecodes =
211 generator()->FinalizeBytecode(isolate(), parse_info()->script());
-> 225 compilation_info()->SetBytecodeArray(bytecodes);
226 compilation_info()->SetCode(
227 BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));
228 return SUCCEEDED;
So BUILTIN_CODE
will expand to:
isolate->builtins()->builtin_handle(Builtins::kInterpreterEntryTrampoline);
So that line will become:
compilation_info()->SetCode(isolate->builtins()->builtin_handle(Builtins::kInterpreterEntryTrampoline));
To recap, builtins/builtins.h' has a macro that creates an enum entry for all of the builtins listed in
src/builtins/builtins-definitions.h`. And we are then calling builtin_handle with that enum value:
V8_EXPORT_PRIVATE Handle<Code> builtin_handle(int index);
So what does this function do?
src/builtins/builtins.cc
Handle<Code> Builtins::builtin_handle(int index) {
DCHECK(IsBuiltinId(index));
return Handle<Code>(reinterpret_cast<Code**>(builtin_address(index)));
}
How builtins get initialized
But how does the builtins_ array get?
src/builtins/builtins.h
:
BUILTIN_LIST(IGNORE_BUILTIN, IGNORE_BUILTIN, DECLARE_TF, DECLARE_TF,
DECLARE_TF, DECLARE_TF, DECLARE_ASM)
Where DECLARE_ASM
is defined as:
#define DECLARE_ASM(Name, ...) \
static void Generate_##Name(MacroAssembler* masm);
So there will be a function named Generate_InterpreterEntryTrampoline.
But we want to know where the builtins_ array is populated. Well, it
Object* builtins_[builtin_count];
And builtins_count
is a member of the Name enum:
enum Name : int32_t {
#define DEF_ENUM(Name, ...) k##Name,
BUILTIN_LIST_ALL(DEF_ENUM)
#undef DEF_ENUM
builtin_count
};
It seems like these entries will get populated when the snapshot is deserialized: (snapshot/builtin-deserializer.cc)
builtins->set_builtin(i, DeserializeBuiltin(i));
Lets take a look at the stack when running generated code from V8. In src/globals.h we find:
typedef byte* Address;
...
#define FOR_EACH_ISOLATE_ADDRESS_NAME(C) \
C(Handler, handler) \
C(CEntryFP, c_entry_fp) \
C(CFunction, c_function) \
C(Context, context) \
C(PendingException, pending_exception) \
C(PendingHandlerContext, pending_handler_context) \
C(PendingHandlerCode, pending_handler_code) \
C(PendingHandlerOffset, pending_handler_offset) \
C(PendingHandlerFP, pending_handler_fp) \
C(PendingHandlerSP, pending_handler_sp) \
C(ExternalCaughtException, external_caught_exception) \
C(JSEntrySP, js_entry_sp)
enum IsolateAddressId {
#define DECLARE_ENUM(CamelName, hacker_name) k##CamelName##Address,
FOR_EACH_ISOLATE_ADDRESS_NAME(DECLARE_ENUM)
#undef DECLARE_ENUM
kIsolateAddressCount
};
Will expand to:
enum IsolateAddressId {
kHandlerAddress,
...
};
Notice that hacker_name
parameter is not used in this call.
In isolate.cc when an Isolate is initialized by bool Isolate::Init(StartupDeserializer* des)
:
#define ASSIGN_ELEMENT(CamelName, hacker_name) \
isolate_addresses_[IsolateAddressId::k##CamelName##Address] = \
reinterpret_cast<Address>(hacker_name##_address());
FOR_EACH_ISOLATE_ADDRESS_NAME(ASSIGN_ELEMENT)
#undef ASSIGN_ELEMENT
isolate_addresses_[IsolateAddressId::HandlerAddress] = reinterpret_cast<Address>(handler_address());
isolate_addresses_
can be found in isolate.h:
Address isolate_addresses_[kIsolateAddressCount + 1];
So where does handler_address()
come from?
This is defined in isolate.h
:
inline Address* c_entry_fp_address() { return &thread_local_top_.c_entry_fp_; }
inline Address* handler_address() { return &thread_local_top_.handler_; }
inline Address* js_entry_sp_address() { return &thread_local_top_.js_entry_sp_; }
inline Address* c_function_address() { return &thread_local_top_.c_function_; }
Context** context_address() { return &thread_local_top_.context_; }
Address pending_message_obj_address() { return reinterpret_cast<Address>(&thread_local_top_.pending_message_obj_); }
THREAD_LOCAL_TOP_ADDRESS(Context*, pending_handler_context)
THREAD_LOCAL_TOP_ADDRESS(Code*, pending_handler_code)
THREAD_LOCAL_TOP_ADDRESS(intptr_t, pending_handler_offset)
THREAD_LOCAL_TOP_ADDRESS(Address, pending_handler_fp)
THREAD_LOCAL_TOP_ADDRESS(Address, pending_handler_sp)
#define THREAD_LOCAL_TOP_ADDRESS(type, name) \
type* name##_address() { return &thread_local_top_.name##_; }
Context* pending_handler_context_address() { return &thread_local_top_.pending_handler_context_;
When is Isolate::Init
called?
It is called from node::Start
Isolate* const isolate = Isolate::New(params);
This will invoke return IsolateNewImpl(isolate, params);
(lldb) expr target->Print()
(lldb) expr v8::internal::JSFunction::cast(*target)
(v8::internal::JSFunction *) $1115 = 0x000017771f4b05f1
(lldb) expr v8::internal::JSFunction::cast(*target)->code()
(v8::internal::Code *) $1116 = 0x0000262dec344281
(lldb) job v8::internal::JSFunction::cast(*target)->code()
So that target functions byte code is of type InterpreterEntryTrampoline
JSEntryStub
So stub_entry is from x64/code-stubs-x64.ccby
void JSEntryStub::Generate`:
(lldb) p stub_entry
(JSEntryFunction) $1126 = 0x0000262dec284060
(lldb) job *code 0x262dec284001: [Code] kind = STUB major_key = JSEntryStub compiler = unknown Instructions (size = 232) 0x262dec284060 0 55 push rbp 0x262dec284061 1 4889e5 REX.W movq rbp,rsp 0x262dec284064 4 6a02 push 0x2
Notice that 0x0000262dec284060
from stub_entry
matches the first instruction in code.
0x262dec2840ed 8d 49ba088c800501000000 REX.W movq r10,0x105808c08 ;; external reference (Isolate::handler_address) 0x262dec2840f7 97 498922 REX.W movq [r10],rsp 0x262dec2840fa 9a e8c1d70b00 call 0x262dec3418c0 (JSEntryTrampoline) ;; code: BUILTIN
(lldb) dis -s 0x262dec3418c0 0x262dec3418c0: movq %rdi, %r11 0x262dec3418c3: movq %rsi, %rdi 0x262dec3418c6: xorl %esi, %esi 0x262dec3418c8: pushq %rbp 0x262dec3418c9: movq %rsp, %rbp 0x262dec3418cc: pushq $0x1c 0x262dec3418ce: movabsq $0x262dec341861, %r10 ; imm = 0x262DEC341861 0x262dec3418d8: pushq %r10
Runtime_CompileLazy
There is a V8 builtin named CompileLazy which when called can compile the function being called (runtime/runtime-compiler.cc): (See RUNTIME_FUNCTION for details regarding the RUNTIME_FUNCTION macro)
RUNTIME_FUNCTION(Runtime_CompileLazy) {
...
if (!Compiler::Compile(function, Compiler::KEEP_EXCEPTION)) {
return isolate->heap()->exception();
}
DCHECK(function->is_compiled());
return function->code();
First time, what is compiled is the entire node_bootstrap.js: InterpreterCompilationJob::Status InterpreterCompilationJob::FinalizeJobImpl() (deps/v8/src/interpreter/interpreter.cc)
(lldb) br s -f interpreter.cc -l 201
Handle<BytecodeArray> bytecodes = generator()->FinalizeBytecode(isolate(), parse_info()->script());
(lldb) job *bytecodes
0x18d63b2dfe1: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
0 E> 0x18d63b2e01a @ 0 : 93 StackCheck
299 S> 0x18d63b2e01b @ 1 : 6f 00 00 00 CreateClosure [0], [0], #0
0x18d63b2e01f @ 5 : 1e fb Star r0
21946 S> 0x18d63b2e021 @ 7 : 97 Return
Constant pool (size = 1)
0x18d63b2dfc9: [FixedArray] in OldSpace
- map = 0x18ddb3022f1 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x18d63b2df19 <SharedFunctionInfo>
Handler Table (size = 16)
Notice that the byte code for this is pretty short. If you look at node_bootstrap.js you see that it is a function. I think this is why the CreateClosure operator is for.
(lldb) job *shared_info
0x1606df4adc91: [SharedFunctionInfo] in OldSpace
- name = 0x160691002441 <String[0]: >
- kind = [ NormalFunction ]
- function_map_index = 129
- formal_parameter_count = 0
- expected_nof_properties = 10
- language_mode = strict
- instance class name = #Object
- code = 0x2066fa144281 <Code BUILTIN>
- bytecode_array = 0x1606df4adfe1
- source code = // Hello, and welcome to hacking node.js!
//
// This file is invoked by node::LoadEnvironment in src/node.cc, and is
// responsible for bootstrapping the node.js core. As special caution is given
// to the performance of the startup process, many dependencies are invoked
// lazily.
Notice that source code
contains the entire content of bootstrap_node.js.
If you later look at 0x18d63b2df19
(from bytecodes above) we can see that this is the inner function:
(lldb) job *inner_shared_info
0x1606df4adf19: [SharedFunctionInfo] in OldSpace
- name = 0x160691002441 <String[0]: >
- kind = [ NormalFunction ]
- function_map_index = 129
- formal_parameter_count = 1
- expected_nof_properties = 10
- language_mode = strict
- instance class name = #Object
- code = 0x2066fa1450e1 <Code BUILTIN>
- source code = (process) {
let internalBinding;
const exceptionHandlerState = { captureFn: null };
function startup() {
And notice that this in now the anonymous function that takes the process object. This is what will be called later.
Next, we have:
compilation_info()->SetBytecodeArray(bytecodes);
compilation_info()->SetCode(BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));
InstallUnoptimizedCode will later use this Code and set that as on the SharedFunctionInfo:
shared->set_code(*compilation_info->code());
The inner_shared_info will also be compiled by FinalizeUnoptimizedCode
and
the bytecode for it will much longer as expected:
(lldb) job *bytecodes
Next this is compiled by FinalizeJobImpl
:
compilation_info()->SetBytecodeArray(bytecodes);
compilation_info()->SetCode(BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));
Again notice that the code is set to InterpreterEntryTrampoline
.
After having done the function will return success.
This will now return and we will end up back in node.cc:
Local<Value> result = script.ToLocalChecked()->Run();
This will land us in v8::Script::Run
and the following line:
auto fun = i::Handle<i::JSFunction>::cast(Utils::OpenHandle(this));
Now if we print this JSFunction using job *fun
we can see the complete object including the source.
We can also take a look at the code and the bytecode:
(lldb) job fun->abstract_code()
0x1606df4adfe1: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
0 E> 0x1606df4ae01a @ 0 : 93 StackCheck
299 S> 0x1606df4ae01b @ 1 : 6f 00 00 00 CreateClosure [0], [0], #0
0x1606df4ae01f @ 5 : 1e fb Star r0
21946 S> 0x1606df4ae021 @ 7 : 97 Return
Constant pool (size = 1)
0x1606df4adfc9: [FixedArray] in OldSpace
- map = 0x1606023022f1 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x1606df4adf19 <SharedFunctionInfo>
Handler Table (size = 16)
And a snipped of the code:
(lldb) job fun->code()
0x2066fa144281: [Code]
kind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x2066fa1442e0 0 488b5f2f REX.W movq rbx,[rdi+0x2f]
0x2066fa1442e4 4 488b5b07 REX.W movq rbx,[rbx+0x7]
0x2066fa1442e8 8 488b4b0f REX.W movq rcx,[rbx+0xf]
0x2066fa1442ec c f6c101 testb rcx,0x1
0x2066fa1442ef f 0f8591010000 jnz 0x2066fa144486 (InterpreterEntryTrampoline)
0x2066fa1442f5 15 f6c101 testb rcx,0x1
Once again we will be back in Invoke
and this following line:
Handle<Code> code = is_construct ? isolate->factory()->js_construct_entry_code() : isolate->factory()->js_entry_code();
In this case is_construct
is false so Handle<Code> will be what ever isolate->factory()->js_entry_code()
returns. And it
returns:
(lldb) job *code
0x2066fa084001: [Code]
kind = STUB
major_key = JSEntryStub
Lets take a closer look at the call isolate->factory()->js_entry_code()
. In src/factory.h we can find:
#define ROOT_ACCESSOR(type, name, camel_name) inline Handle<type> name();
ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR
And we can find ROOT_LIST
in src/heap/heap.h
:
#define ROOT_LIST(V) \
STRONG_ROOT_LIST(V) \
SMI_ROOT_LIST(V) \
V(StringTable, string_table, StringTable)
STRONG_ROOT_LIST(V)
contains (among others):
/* JS Entries */ \
V(Code, js_entry_code, JsEntryCode) \
V(Code, js_construct_entry_code, JsConstructEntryCode)
And in factory-inl.h
we have:
#define ROOT_ACCESSOR(type, name, camel_name) \
Handle<type> Factory::name() { \
return Handle<type>(bit_cast<type**>( \
&isolate()->heap()->roots_[Heap::k##camel_name##RootIndex])); \
}
ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR
So, for js_entry_code
the following would be expanded by the preprocessor:
Handle<JsEntryCode> Factory::js_entry_code() {
return Handle<JsEntryCode>(bit_cast<JsEntryCode**>(&isolate()->heap()->roots_[Heap::kJsEntryCodeRootIndex]));
}
Ok so back to the Invoke
function:
JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
...
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
Now, we know that stub_entry
is the JSEntryStub
(src/x64/code-stubs-x64.cc) and func
is the InterpreterEntryTrampoline
(src/builtins/x64/builtins-x64.cc).
-> 0xd099dd84060: pushq %rbp // function prologue
0xd099dd84061: movq %rsp, %rbp // function prologue Stack
0xd099dd84064: pushq $0x2 // the stack frame marker type 2 (Marker Type ConstructEntryFrame)
0xd099dd84066: movabsq $0x105808b90, %r10 // mov the value of the current context address
0xd099dd84070: movq (%r10), %r10 // dereference so the address is in r10
0xd099dd84073: pushq %r10 // push the context address onto the stack address to current context
0xd099dd84075: pushq %r12 // store callee saved registers which need to be preserved whatever was in r12
0xd099dd84077: pushq %r13 // as above whatever was in r13
0xd099dd84079: pushq %r14 // as above whatever was in r14
0xd099dd8407b: pushq %r15 // as above whatever was in r15
0xd099dd8407d: pushq %rbx // as above whatever was in rbx
0xd099dd8407e: movabsq $0x105807248, %r13 // InitializeRootRegister (x64/macro-assembler-x64.h)
0xd099dd84088: addq $0x80, %r13 // also part of InitializeRootRegister
0x16e271f840f: movabsq $0x105808c00, %r10 // mov the frame pointer address into r10
0x16e271f8409: pushq (%r10) // dereferense so that the address on the stack address to frame pointer descriptor
0x16e271f840C: movabsq 0x105808c20, %rax // mov the js_entry_sp (stack pointer) into rax
0x16e271f840a6: testq %rax, %rax // test
0x16e271f840a9: jne 0x16e271f840c3 // and jump if zero flag is 0 (rax is not all zeros) which is not the case
0xd099dd840af: pushq $0x2 // StackFrame::OUTERMOST_JSENTRY_FRAME))
0xd099dd840b1: movq %rbp, %rax // move the current frame base pointer to rax
0xd099dd840b4: movabsq %rax, 0x105808c20 // move the base frame pointer to memory location
0xd099dd840be: jmp 0xd099dd840c5 // unconditional jump (jmp &cont)
0xd099dd840c5: jmp 0xd099dd840e0 // unconditional jump (jmp &invok)
0xd099dd840e0: movabsq $0x105808c08, %r10 // move handler_address to r10 (PushStackHandler src/x64/macro-assembler-x64.cc)
0xd099dd840ea: pushq (%r10) // push the dereferenced address of handler_address onto the stack address to handler
0xd099dd840ed: movabsq $0x105808c08, %r10 // move the address again. not sure why this is done again through
0xd099dd840f7: movq %rsp, (%r10) // set the stack pointer to be that of the handler. This will be the I
0xd099dd840fa: callq 0xd099de418c0 // __ Call(BUILTIN_CODE(isolate(), JSEntryTrampoline), RelocInfo::CODE_TARGET);
`Generate_JSEntryTrampolineHelper(MacroAssembler* masm, bool is_construct)` in `src/builtins/x64/builtins-x64.cc`:
0x16e2720418c0: movq %rdi, %r11 // mov new_target into r11
0x16e2720418c3: movq %rsi, %rdi // move func into rdi
0x16e2720418c6: xorl %esi, %esi // __ Set(rsi, 0);
0x16e2720418c8: pushq %rbp // function prologue previous base frame pointer
0x16e2720418c9: movq %rsp, %rbp // function prologue
0x16e2720418cc: pushq $0x1c // the stack frame marker type 1c (28)
0x16e2720418ce: movabsq $0x16e272041861, %r10 // move the CodeObject address into r10??
0x16e2720418d8: pushq %r10 // push the CodeObject onto the stack code object
0x16e2720418da: movabsq $0x37b3a2c022e1, %r10 // emit_debug_code
0x16e2720418e4: cmpq %r10, (%rsp) // emit_debug_code
0x16e2720418e8: jne 0x16e2720418fa // emit_debug_code
0x16e2720418fa: movabsq $0x105808b90, %r10
0x16e272041904: movq (%r10), %rsi
0x16e272041907: pushq %rdi function
0x16e272041908: pushq %rdx receiver
0x16e272041909: movq %rcx, %rax
0x16e27204190c: movq %r8, %rbx
0x16e27204190f: movq %r11, %rdx
Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt):
0x16e272041912: movq 0xd08(%r13), %r10 // LoadRoot (x64/macro-assembler-x64.cc) (index << kPointerSizeLog2) - kRootRegisterBias))
0x16e272041919: movq %rsp, %rcx //
0x16e27204191c: subq %r10, %rcx
0x16e27204191f: movq %rax, %r11
0x16e272041922: shlq $0x3, %r11
0x16e272041926: cmpq %r11, %rcx
0x16e272041929: jg 0x16e272041940
0x16e272041940: xorl %ecx, %ecx // __ Set(rcx, 0); Start loop to copy args onto the stack
0x16e272041942: jmp 0x16e27204194f
0x16e27204194f: cmpq %rax, %rcx // __ cmpp(rcx, rax);
0x16e272041954: callq 0x16e27203c400 // __ Call(masm->isolate()->builtins()->Call(), RelocInfo::CODE_TARGET)
The last `Call` I think will land in (x64/macro-assembler-x64.cc line 2003):
```c++
void TurboAssembler::Call(Handle<Code> code_object, RelocInfo::Mode rmode) {
call(code_object, rmode);
#endif
}
call
is declared in src/x64/assembler-x64.h
:
void call(Handle<Code> target, RelocInfo::Mode rmode = RelocInfo::CODE_TARGET);
The definition can be found in src/x64/assembler-x64.cc
:
void Assembler::call(Address entry, RelocInfo::Mode rmode) {
DCHECK(RelocInfo::IsRuntimeEntry(rmode));
EnsureSpace ensure_space(this);
// 1110 1000 #32-bit disp.
emit(0xE8);
emit_runtime_entry(entry, rmode);
}
If we look again at the callq:
0x16e272041954: callq 0x16e27203c400
And display the opcode for callq:
(lldb) dis -f -b
-> 0x16e272041954: e8 a7 aa ff ff callq 0x16e27203c400
We can see match the opcode emitted by emit(0xE8)
to e8. But what is returned by the call to
masm->isolate()-builtins()->call()`?
In src/isolate.h we have:
Builtins* builtins() { return &builtins_; }
If I back up 2 frames and try to execute isolate()->builtins()->Call() I get:
(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
In src/builtins/builtins.h
we can find the Call
function with has a default value for the mode
parameter:
Handle<Code> Call(ConvertReceiverMode = ConvertReceiverMode::kAny);
If we take a look in src/builtins/builtins-call.cc
we find the definition of Builtins::Call
:
Handle<Code> Builtins::Call(ConvertReceiverMode mode) {
switch (mode) {
case ConvertReceiverMode::kNullOrUndefined:
return builtin_handle(kCall_ReceiverIsNullOrUndefined);
case ConvertReceiverMode::kNotNullOrUndefined:
return builtin_handle(kCall_ReceiverIsNotNullOrUndefined);
case ConvertReceiverMode::kAny:
return builtin_handle(kCall_ReceiverIsAny);
}
UNREACHABLE();
}
And if we look in src/builtins/builtins-definitions.h we can find Call_ReceiverIsAny`:
ASM(Call_ReceiverIsAny)
And we should be able to use ConvertReceiverMode::kAny
with the Call
function to find out
what is going to be called (the code_object in call(code_object, rmode);
:
(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
0x16e27203c3a1: [Code]
kind = BUILTIN
name = Call_ReceiverIsAny
compiler = unknown
Instructions (size = 178)
0x16e27203c400 0 40f6c701 testb rdi,0x1
0x16e27203c404 4 0f8446000000 jz 0x16e27203c450 (Call_ReceiverIsAny)
0x16e27203c40a a 488b4fff REX.W movq rcx,[rdi-0x1]
0x16e27203c40e e 80790bff cmpb [rcx+0xb],0xff
0x16e27203c412 12 0f8408faffff jz 0x16e27203be20 (CallFunction_ReceiverIsAny) ;; code: BUILTIN
0x16e27203c418 18 80790bfe cmpb [rcx+0xb],0xfe
0x16e27203c41c 1c 0f845efcffff jz 0x16e27203c080 (CallBoundFunction) ;; code: BUILTIN
0x16e27203c422 22 f6410c02 testb [rcx+0xc],0x2
0x16e27203c426 26 0f8424000000 jz 0x16e27203c450 (Call_ReceiverIsAny)
0x16e27203c42c 2c 80790bb7 cmpb [rcx+0xb],0xb7
0x16e27203c430 30 0f8505000000 jnz 0x16e27203c43b (Call_ReceiverIsAny)
0x16e27203c436 36 e9e5000000 jmp 0x16e27203c520 (CallProxy) ;; code: BUILTIN
0x16e27203c43b 3b 48897cc408 REX.W movq [rsp+rax*8+0x8],rdi
0x16e27203c440 40 488b7e27 REX.W movq rdi,[rsi+0x27]
0x16e27203c444 44 488bbfff000000 REX.W movq rdi,[rdi+0xff]
0x16e27203c44b 4b e970f7ffff jmp 0x16e27203bbc0 (CallFunction_ReceiverIsNotNullOrUndefined) ;; code: BUILTIN
0x16e27203c450 50 55 push rbp
0x16e27203c451 51 4889e5 REX.W movq rbp,rsp
0x16e27203c454 54 6a1c push 0x1c
0x16e27203c456 56 49baa1c30372e2160000 REX.W movq r10,0x16e27203c3a1 (Call_ReceiverIsAny) ;; object: 0x16e27203c3a1 <Code BUILTIN>
0x16e27203c460 60 4152 push r10
0x16e27203c462 62 49bae122c0a2b3370000 REX.W movq r10,0x37b3a2c022e1 ;; object: 0x37b3a2c022e1 <undefined>
0x16e27203c46c 6c 4c391424 REX.W cmpq [rsp],r10
0x16e27203c470 70 7510 jnz 0x16e27203c482 (Call_ReceiverIsAny)
0x16e27203c472 72 48ba0000000009000000 REX.W movq rdx,0x900000000
0x16e27203c47c 7c e83f950200 call 0x16e2720659c0 (Abort) ;; code: BUILTIN
0x16e27203c481 81 cc int3l
0x16e27203c482 82 57 push rdi
0x16e27203c483 83 b801000000 movl rax,0x1
0x16e27203c488 88 48bb9002430101000000 REX.W movq rbx,0x101430290 ;; external reference (Runtime::ThrowCalledNonCallable)
0x16e27203c492 92 e8a982f4ff call 0x16e271f84740 ;; code: STUB, CEntryStub, minor: 8
0x16e27203c497 97 48837df81c REX.W cmpq [rbp-0x8],0x1c
0x16e27203c49c 9c 7410 jz 0x16e27203c4ae (Call_ReceiverIsAny)
0x16e27203c49e 9e 48ba000000004f000000 REX.W movq rdx,0x4f00000000
0x16e27203c4a8 a8 e813950200 call 0x16e2720659c0 (Abort) ;; code: BUILTIN
0x16e27203c4ad ad cc int3l
0x16e27203c4ae ae 488be5 REX.W movq rsp,rbp
0x16e27203c4b1 b1 5d pop rbp
RelocInfo (size = 11)
0x16e27203c414 code target (BUILTIN) (0x16e27203be20)
0x16e27203c41e code target (BUILTIN) (0x16e27203c080)
0x16e27203c437 code target (BUILTIN) (0x16e27203c520)
0x16e27203c44c code target (BUILTIN) (0x16e27203bbc0)
0x16e27203c458 embedded object (0x16e27203c3a1 <Code BUILTIN>)
0x16e27203c464 embedded object (0x37b3a2c022e1 <undefined>)
0x16e27203c47d code target (BUILTIN) (0x16e2720659c0)
0x16e27203c48a external reference (Runtime::ThrowCalledNonCallable) (0x101430290)
0x16e27203c493 code target (STUB) (0x16e271f84740)
0x16e27203c4a9 code target (BUILTIN) (0x16e2720659c0)
To find out what generated this code we can look in src/builtins/builtins-call-gen.cc
:
void Builtins::Generate_Call_ReceiverIsAny(MacroAssembler* masm) {
Generate_Call(masm, ConvertReceiverMode::kAny);
}
And an implementation can be found in src/builtins/x64/builtins-x64.cc
:
void Builtins::Generate_Call(MacroAssembler* masm, ConvertReceiverMode mode) {
}
Now, the actual produced assembly code looks like this: (lldb) register read rax rdi rax = 0x0000000000000000 // number of args rdi = 0x000037b34d8b06d9 // the target to call
0x16e27203c400: testb $0x1, %dil // __ JumpIfSmi(rdi, &non_callable); which consists of CheckSmi. dil is the low 8 bits of rdi
0x16e27203c404: je 0x16e27203c450 // __ JumpIfSmi(rdi, &non_callable: which after CheckSmi will jump. rdi is the target and not a smi in our case
0x16e27203c40a: movq -0x1(%rdi), %rcx // __ CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx); which does a movp
0x16e27203c40e: cmpb $-0x1, 0xb(%rcx) // __ CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx); which does a cmpb
0x16e27203c412: je 0x16e27203be20 // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
// I was not sure where to find this `j` function but it is in src/x64/assembler-x64.cc
// so what are we jumping to? We can back up in the debugger and find out:
// (lldb) up 2
// (lldb job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// This will be a builtin named CallFunction_ReceiverIsAny, and being a builtin you'll find it in
// `src/builtins/builtins-definitions.h`. The implementation will be in `src/builtins/builtins-call.cc` and
// will result in `return builtin_handle(kCallFunction_ReceiverIsAny)`. This code repsonsible for generating
// is Builtins::Generate_CallFunction and in our case that means `src/builtins/x64/builtins-x64.cc`
Builtins::Generate_Call(MacroAssembler* masm, ConvertReceiverMode mode) (src/builtins/x64/builtins-x64.cc)
0x16e27203be20: testb $0x1, %dil // __ JumpIfSmi(rdi, &non_callable);
0x16e27203be24: jne 0x16e27203be36 // __ JumpIfSmi(rdi, &non_callable);
0x16e27203be36: pushq %rdi // __ Push(rdi);
__ CallRuntime(Runtime::kThrowCalledNonCallable); in `src/x64/macro-assembler-x64.h` and
...
0x16e27203be63: movq 0x27(%rdi), %rsi //__ movp(rsi, FieldOperand(rdi, JSFunction::kContextOffset));
0x16e27203be67: testb $0x3, 0x87(%rdx) // __ testl(FieldOperand(rdx, SharedFunctionInfo::kCompilerHintsOffset), Immediate(SharedFunctionInfo::IsNativeBit::kMask | SharedFunctionInfo::IsStrictBit::kMask));
0x16e27203be6e: jne 0x16e27203bf14 // __ j(not_zero, &done_convert);
// rax = nr args, rdx = SharedFunctionInfo, rdi = target function, rsi = the function context
0x16e27203bf14: movslq 0x73(%rdx), %rbx // __ movsxlq(rbx, FieldOperand(rdx, SharedFunctionInfo::kFormalParameterCountOffset));
0x16e27203bf18: movabsq $0x1050057f2, %r10
0x16e27203bf18: movabsq $0x1050057f2, %r10 // __ InvokeFunctionCode(rdi, no_reg, expected, actual, JUMP_FUNCTION);
0x16e27203bfa4: movq -0x60(%r13), %rdx // LoadRoot(rdx, Heap::kUndefinedValueRootIndex);
0x16e27203bfa8: cmpq %rax, %rbx // cmpp(expected.reg(), actual.reg());
0x16e27203bfb2: movq 0x37(%rdi), %rcx // movp(rcx, FieldOperand(function, JSFunction::kCodeOffset));
0x16e27203bfb6: addq $0x5f, %rcx // addp(rcx, Immediate(Code::kHeaderSize - kHeapObjectTag));
0x16e27203bfba: jmpq *%rcx // jmp(rcx);
Once again I got lost :(
I do know that if I set through I'll end up in RUNTIME_FUNCTION(Runtime_InterpreterNewClosure
in runtime/runtime-interpreter.cc
which has this line:
return *isolate->factory()->NewFunctionFromSharedFunctionInfo(shared, context, vector_cell,static_cast<PretenureFlag>(pretenured_flag));
This call will delegate to another NewFunctionFromSharedFunctionInfo
and:
Handle<JSFunction> result = NewFunction(initial_map, info, context_or_undefined, pretenure);
NewFunction
will
Handle<JSFunction> function = New<JSFunction>(map, space);
function->initialize_properties();
function->initialize_elements();
function->set_shared(*info);
function->set_code(info->code());
function->set_context(*context_or_undefined);
function->set_prototype_or_initial_map(*the_hole_value());
function->set_feedback_vector_cell(*undefined_cell());
code
will be the InterpreterEntryTrampoline
and it looks like it was already been compiled:
(lldb) expr info->is_compiled()
(bool) $483 = true
This Handle<JSFunction> is then returned RUNTIME_FUNCTION(Runtime_InterpreterNewClosure
:
0x101437967 <+263>: movq %rax, -0x8(%rbp) // store the value return value Handle<JSFunction> on the stack
0x10143796b <+267>: movq -0x8(%rbp), %rax // move it into rax which is the register for return value
0x10143796f <+271>: addq $0x50, %rsp // clean up local stack variables
0x101437973 <+275>: popq %rbp // pop the previous functions base frame pointer
0x101437974 <+276>: retq // transfer control to the return address on the stack (placed there by call)
0x101437975 <+277>: nopw %cs:(%rax,%rax)
(after retq):
0x16e271f847a4: cmpq 0x88(%r13), %rax
0x16e271f847ab: je 0x16e271f847f8
0x16e271f847b1: movq -0x58(%r13), %r14
0x16e271f847b5: movabsq $0x105808ba0, %r10
0x16e271f847bf: cmpq (%r10), %r14
0x16e271f847c2: je 0x16e271f847c5
0x16e271f847c5: movq 0x8(%rbp), %rcx
0x16e271f847c9: movq (%rbp), %rbp
0x16e271f847cd: leaq 0x8(%r15), %rsp
0x16e271f847d1: pushq %rcx
0x16e271f847d2: movabsq $0x105808b90, %r10
0x16e271f847dc: movq (%r10), %rsi
0x16e271f847df: movq $0x0, (%r10)
0x16e271f847e6: movabsq $0x105808c00, %r10 ; imm = 0x105808C00
0x16e271f847f0: movq $0x0, (%r10)
0x16e271f847f7: retq
0x16e271fd6cfd: movq %rax, %rdx
0x16e271fd6d00: movq %rax, -0x28(%rbp)
0x16e271fd6d04: movq %rsp, %rax
0x16e271fd6d07: cmpq -0x38(%rbp), %rax
0x16e271fd6d0b: je 0x16e271fd6d42
0x16e271fd6d42: movq -0x20(%rbp), %r12
0x16e271fd6d46: addq $0x4, %r12
0x16e271fd6d4a: movq -0x18(%rbp), %rdx
0x16e271fd6d4e: movq -0x18(%rdx), %r14
0x16e271fd6d52: movzbl (%r12,%r14), %eax
0x16e271fd6d57: movabsq $0x100000000, %r10 ; imm = 0x100000000
0x16e271fd6d61: cmpq %rax, %r10
0x16e271fd6d64: jae 0x16e271fd6d76
0x16e271fd6d76: movq -0x10(%rbp), %r15
0x16e271fd6d7a: movq (%r15,%rax,8), %rbx
0x16e271fd6d7e: movq (%rbp), %rbp
0x16e271fd6d82: movq 0x38(%rsp), %rax
0x16e271fd6d87: addq $0x68, %rsp
0x16e271fd6d8b: jmpq *%rbx
0x16e271fbdde0: movsbq 0x1(%r14,%r12), %rbx
0x16e271fbdde6: movq %rbp, %rdx
0x16e271fbdde9: movq %rax, (%rdx,%rbx,8)
0x16e271fbdded: addq $0x2, %r12
0x16e271fbddf1: movzbl (%r12,%r14), %ebx
0x16e271fbddf6: movabsq $0x100000000, %r10 ; imm = 0x100000000
0x16e271fbde00: cmpq %rbx, %r10
0x16e271fbde03: jae 0x16e271fbde15
0x16e271fbde15: movq (%r15,%rbx,8), %rbx
0x16e271fbde19: jmpq *%rbx
0x16e271fdca60: movq %rbp, %rbx
0x16e271fdca63: movl $0x0, -0x20(%rbx)
0x16e271fdca6a: movl %r12d, -0x1c(%rbx)
0x16e271fdca6e: movl %r12d, %edx
0x16e271fdca71: movabsq $0x100000000, %r10 ; imm = 0x100000000
0x16e271fdca7b: cmpq %rdx, %r10
0x16e271fdca7e: jae 0x16e271fdca90
0x16e271fdca90: subl $0x39, %edx
0x16e271fdca93: movabsq $0x100000000, %r10 ; imm = 0x100000000
0x16e271fdca9d: cmpq %rdx, %r10
0x16e271fdcaa0: jae 0x16e271fdcab2
0x16e271fdcab2: cmpl $0x0, %edx
0x16e271fdcab5: jl 0x16e271fdcb0b
0x16e271fdcab7: movl 0x33(%r14), %ecx
0x16e271fdcabb: movabsq $0x100000000, %r10 ; imm = 0x100000000
0x16e271fdcac5: cmpq %rcx, %r10
0x16e271fdcac8: jae 0x16e271fdcada
0x16e271fdcada: subl $0x1, %ecx
0x16e271fdcadd: movabsq $0x100000000, %r10 ; imm = 0x100000000
0x16e271fdcae7: cmpq %rcx, %r10
0x16e271fdcaea: jae 0x16e271fdcafc
0x16e271fdcafc: subl %edx, %ecx
0x16e271fdcafe: cmpl $0x0, %ecx
0x16e271fdcb01: jl 0x16e271fdcb1c
0x16e271fdcb03: movq -0x18(%rbx), %rbx
0x16e271fdcb07: movl %ecx, 0x33(%rbx)
0x16e271fdcb0a: retq
0x16e272044654: movq -0x18(%rbp), %r14
0x16e272044658: movq -0x20(%rbp), %r12
0x16e27204465c: shrq $0x20, %r12
0x16e272044660: movzbl (%r14,%r12), %ebx
0x16e272044665: cmpb $-0x69, %bl
0x16e272044668: je 0x16e2720446a3
0x16e2720446a3: movq -0x18(%rbp), %rbx
0x16e2720446a7: movl 0x2b(%rbx), %ebx
0x16e2720446aa: leave // will leave the function
0x16e2720446ab: popq %rcx
0x16e2720446ac: addq %rbx, %rsp
0x16e2720446af: pushq %rcx
0x16e2720446b0: retq
0x16e272041959: cmpq $0x1c, -0x8(%rbp)
0x16e27204195e: je 0x16e272041970
0x16e272041970: movq %rbp, %rsp
0x16e272041973: popq %rbp
0x16e272041974: retq
0x16e271f840ff: movabsq $0x105808c08, %r10 ; imm = 0x105808C08
0x16e271f84109: popq (%r10)
0x16e271f8410c: addq $0x0, %rsp
0x16e271f84110: popq %rbx
0x16e271f84111: cmpq $0x2, %rbx
0x16e271f84115: jne 0x16e271f8412c
0x16e271f8411b: movabsq $0x105808c20, %r10 ; imm = 0x105808C20
0x16e271f84125: movq $0x0, (%r10)
0x16e271f8412c: movabsq $0x105808c00, %r10 ; imm = 0x105808C00
0x16e271f84136: popq (%r10)
0x16e271f84139: popq %rbx // pop callee saved registers
0x16e271f8413a: popq %r15
0x16e271f8413c: popq %r14
0x16e271f8413c: popq %r14
0x16e271f8413e: popq %r13
0x16e271f84140: popq %r12
0x16e271f84142: addq $0x10, %rsp
0x16e271f84146: popq %rbp
0x16e271f84147: retq
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
value
is the function of type JSFunction:
(lldb) job JSFunction::cast(value)->code()
(lldb) job JSFunction::cast(value)->abstract_code()
(lldb) job JSFunction::cast(value)->feedback_vector()
Just to recall, we called this function:
Local<Value> f_value = ExecuteString(env, MainSource(env), script_name);
To compile boostrap_node.js and the returned value if a Function that has been compiled. This is later called:
auto ret = f->Call(env->context(), Null(env->isolate()), 1, &arg);
This will once again land us in Invoke
and:
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
This is a function call so we will first entry the JSEntryStub, then the InterpreterTrampoline
which will delegate to Generate_Call
0x3db5b98c1954: callq 0x3db5b98bc400
0x3db5b98bc400: testb $0x1, %dil // __ JumpIfSmi(rdi, &non_callable);
0x3db5b98bc404: je 0x3db5b98bc450 // __ JumpIfSmi(rdi, &non_callable);
0x3db5b98bc40a: movq -0x1(%rdi), %rcx // __ CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx);
0x3db5b98bc40e: cmpb $-0x1, 0xb(%rcx) // CmpInstanceType(map, type) called by CmpObjectType
0x3db5b98bc412: je 0x3db5b98bbe20 // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
// So the next instruction should match that from CallFunction.
0x3db5b98bbe20: testb $0x1, %dil //
Assembler:j(Condition cc, Handle<Code> target, RelocInfo::Mode rmode)
Can be found in src/x64/assembler-x64.cc
(line 1367):
void Assembler::j(Condition cc, Handle<Code> target, RelocInfo::Mode rmode) {
EnsureSpace ensure_space(this);
DCHECK(is_uint4(cc));
// 0000 1111 1000 tttn #32-bit disp.
emit(0x0F);
emit(0x80 | cc);
emit_code_target(target, rmode);
}
``
Take this expression as an example:
```c++
__ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
Condition
is an enum found in src/x64/assembler-x64.h
:
enum Condition {
// any value < 0 is considered no_condition
no_condition = -1,
overflow = 0,
no_overflow = 1,
below = 2,
above_equal = 3,
equal = 4,
not_equal = 5,
below_equal = 6,
above = 7,
negative = 8,
positive = 9,
parity_even = 10,
parity_odd = 11,
less = 12,
greater_equal = 13,
less_equal = 14,
greater = 15,
// Fake conditions that are handled by the
// opcodes using them.
always = 16,
never = 17,
// aliases
carry = below,
not_carry = above_equal,
zero = equal,
not_zero = not_equal,
sign = negative,
not_sign = positive,
last_condition = greater
};
The jump instruction generated for this would look like:
(lldb) dis -f -b
-> 0x16e27203c412: 0f 84 08 fa ff ff je 0x16e27203be20
emit(0x0F);
emit(0x80 | cc);
emit_code_target(target, rmode);
So we can see that 0f
matches the emit(0x0F)
call.
And 84
matches emit(0x80 | 4)
which is 84 in hex. (4 is Condition::equal)
I was wondering about (0x80 | cc)
and what that does. Well this will determin the type of jmp
opcode to emit. A near jmp opcode consists of two bytes, the first being with 0F
and the second
varies depending on the type of jmp. So lets take Condition::not_equal which is 5:
(lldb) expr -f hex -- `0x80 | 5`
(int) $304 = 0x00000085
The last line, emit_code_target(target, rmode)
which can be found in src/x64/assembler-x64-inl.h
.
int current = static_cast<int>(code_targets_.size());
expr isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->address()
(v8::internal::Address) $306 = 0x0000302f3413c3a0
x64 jump instructions
Instruction Description Flags short jump opcode near jump opcodes
- JO Jump if overflow ZF = 1 70 0F 80
- JE/JZ Jump if equal/Jump if zero ZF = 1 74 0F 84
- JNE/JNZ Jump if not equal/Jump if no zero ZF = 1 75 0F 85
Types of Jumps
A short jmp
is encoded as two bytes 74 and the number of bytes +/- relative to the instruction pointer. The operand can
only be a 8-bit operand. So this can jump -126 to +129 bytes.
A near jmp
allows for jumps in the current segment and uses 0F 84 and the operand is a 16-bit operand
So a function is translated into bytecode by the ByteCodeGenerator which will visit the AST and emit bytecodes for each AST node. The bytecodes are set on the SharedFunctionInfo object and the code entry address is set to the InterpreterEntryTrampoline builtin stub. InterpreterEntryTrampoline will set up the stack frame and then dispatch to the interpreter's bytecode handler for the functions first bytecode. I think this is what the last call is doing above. So what is the first bytecode in our case?
(lldb) up 2
(lldb) job v8::internal::JSFunction::cast(func)->abstract_code()
0x37b34d8adfe1: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
0 E> 0x37b34d8ae01a @ 0 : 93 StackCheck
299 S> 0x37b34d8ae01b @ 1 : 6f 00 00 00 CreateClosure [0], [0], #0
0x37b34d8ae01f @ 5 : 1e fb Star r0
21946 S> 0x37b34d8ae021 @ 7 : 97 Return
Constant pool (size = 1)
0x37b34d8adfc9: [FixedArray] in OldSpace
- map = 0x37b31cc022f1 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x37b34d8adf19 <SharedFunctionInfo>
Handler Table (size = 16)
So at this point we have the JSEntryStub which will take information from the isolate->isolate->thread_local_top_ and make it available to the rest of the code that will be called. This is something that has to be done each time a function is entered. Now, the first InterpreterEntryTrampoline is called and hopefully we'll be able to verify that this goes through the func->abstract_code() and compiles it.
(lldb) job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny)) 0x3db5b98bbdc1: [Code]
kind = BUILTIN
name = CallFunction_ReceiverIsAny
compiler = unknown
Instructions (size = 510)
0x3db5b98bbe20 0 40f6c701 testb rdi,0x1 // __ AssertFunction
0x3db5b98bbe24 4 7510 jnz 0x3db5b98bbe36 (CallFunction_ReceiverIsAny)
0x3db5b98bbe26 6 48ba0000000036000000 REX.W movq rdx,0x3600000000
0x3db5b98bbe30 10 e88b9b0200 call 0x3db5b98e59c0 (Abort) ;; code: BUILTIN
0x3db5b98bbe35 15 cc int3l
0x3db5b98bbe36 16 57 push rdi
0x3db5b98bbe37 17 488b7fff REX.W movq rdi,[rdi-0x1]
0x3db5b98bbe3b 1b 807f0bff cmpb [rdi+0xb],0xff
0x3db5b98bbe3f 1f 5f pop rdi
0x3db5b98bbe40 20 7410 jz 0x3db5b98bbe52 (CallFunction_ReceiverIsAny)
0x3db5b98bbe42 22 48ba000000003b000000 REX.W movq rdx,0x3b00000000
0x3db5b98bbe4c 2c e86f9b0200 call 0x3db5b98e59c0 (Abort) ;; code: BUILTIN
0x3db5b98bbe51 31 cc int3l // end __ AssertFunction
0x3db5b98bbe52 32 488b571f REX.W movq rdx,[rdi+0x1f]
Codestubs
Are declared in src/code-stubs.h
. Lets take a look at two CEntry
and JSEntry
:
#define CODE_STUB_LIST_ALL_PLATFORMS(V) \
/* --- PlatformCodeStubs --- */ \
...
V(CEntry) \
...
V(JSEntry) \
CodeStub extends ZoneObject. It has an enum named major with all the code stubs:
enum Major {
NoCache = 0,
...
CEntry,
...,
JSEntry,
};
Handle<Code> GetCode()
will return the Code for this code stub and generate it if needed.
If the code is not in the cache then the following will be called to generate the code:
Handle<Code> new_object = GenerateCode();
Zone
Is very well documented:
// The Zone supports very fast allocation of small chunks of
// memory. The chunks cannot be deallocated individually, but instead
// the Zone supports deallocating all chunks in one fast
// operation. The Zone is used to hold temporary data structures like
// the abstract syntax tree, which is deallocated after compilation.
ZoneObject
Is an object that exist in a zone and is indented to be extended (like CodeStub does).
Script::Compile
Will delegate to ScriptCompiler::Compile, and then to ScriptCompiler::CompileUnboundInternal, and then to ScriptCompiler::CompileUnboundInternal.
i::MaybeHandle<i::SharedFunctionInfo> maybe_function_info = i::Compiler::GetSharedFunctionInfoForScript(
-> 2332 str, name_obj, line_offset, column_offset, source->resource_options,
2333 source_map_url, isolate->native_context(), NULL, &script_data,
2334 options, i::NOT_NATIVES_CODE, host_defined_options);
ParseInfo parse_info(script);
Zone compile_zone(isolate->allocator(), ZONE_NAME);
...
maybe_result = CompileToplevel(&parse_info, isolate);
CompileToplevel
if (parse_info->literal() == nullptr && !parsing::ParseProgram(parse_info, isolate)) {
...
std::unique_ptr<CompilationJob> outer_function_job(GenerateUnoptimizedCode(parse_info, isolate, &inner_function_jobs));
...
GenerateUnoptimizedCode
src/compiler.cc
Compiler::EagerInnerFunctionLiterals inner_literals;
if (!Compiler::Analyze(parse_info, &inner_literals)) {
return std::unique_ptr<CompilationJob>();
}
std::unique_ptr<CompilationJob> outer_function_job(
PrepareAndExecuteUnoptimizedCompileJob(parse_info, parse_info->literal(), isolate));
PrepareAndExecuteUnoptimizedCompileJob
src/compiler.cc
(lldb) br s -f compiler.cc -l 385
std::unique_ptr<CompilationJob> job(interpreter::Interpreter::NewCompilationJob(parse_info, literal, isolate));
if (job->PrepareJob() == CompilationJob::SUCCEEDED && job->ExecuteJob() == CompilationJob::SUCCEEDED) {
return job;
}
Lets take a look at parse_info
:
(lldb) job parse_info->script_->source()
"'use strict';\x0a\x0a(function frogger(process) {\x0a process._rawDebug('entry function');\x0a\x0a function startup() {\x0a process._rawDebug('startup function');\x0a return true;\x0a }\x0a\x0a startup();\x0a});\x0a"
Notice that this is the complete contents of bootstrap_node.js:
(lldb) job parse_info->script_->name()
"bootstrap_node.js"
PrepareJob
will print the AST if configured with --print-ast
:
[generating bytecode for function: ]
--- AST ---
FUNC at 0
. KIND 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. EXPRESSION STATEMENT at 284
. . LITERAL "use strict"
. EXPRESSION STATEMENT at 299
. . ASSIGN at -1
. . . VAR PROXY local[0] (0x10702f338) (mode = TEMPORARY) ".result"
. . . FUNC LITERAL at 300
. . . . NAME
. . . . INFERRED NAME
. . . . PARAMS
. . . . . VAR (0x10701cd48) (mode = VAR) "process"
. RETURN at -1
. . VAR PROXY local[0] (0x10702f338) (mode = TEMPORARY) ".result"
VAR PROXY
indicates that this scope resolution will connect these nodes declaring VAR nodes.
This is all PrepareJob
does.
ExecuteJob()
will call:
return UpdateState(ExecuteJobImpl(), State::kReadyToFinalize);
The call to ExecuteJobImpl
will end up in interpreter.cc:191 in our case. In this function we find the following:
generator()->GenerateBytecode(stack_limit());
Now we are getting closer to figuring out how the bytecode is generated. This will land us in bytecode-generator.cc:894.
GenerateBytecode
:
InitializeAstVisitor(stack_limit);
ContextScope incoming_context(this, closure_scope());
RegisterAllocationScope register_scope(this);
AllocateTopLevelRegisters();
...
GenerateBytecodeBody();
GenerateBytecodeBody:
...
VisitDeclarations(closure_scope()->declarations()); // no declarations in our case
VisitModuleNamespaceImports();
...
builder()->StackCheck(info()->literal()->start_position());
VisitStatements(info()->literal()->body());
...
This will later call OutputStackCheck();
(lldb) p *node
(v8::internal::interpreter::BytecodeNode) $73 = {
bytecode_ = kStackCheck
operands_ = ([0] = 0, [1] = 0, [2] = 0, [3] = 0, [4] = 0)
operand_count_ = 0
operand_scale_ = kSingle
source_info_ = (position_type_ = kExpression, source_position_ = 0)
}
bytecode-array-writer.cc:60 will do the actual writing of the node:
UpdateSourcePositionTable(node);
EmitBytecode(node);
TODO: take a closer look at the source position table.
EmitBytecode
can be found in bytecode-array-writer.cc:192:
Bytecode bytecode = node->bytecode();
OperandScale operand_scale = node->operand_scale();
In this case bytecode is kStackCheck. Now, kStackCheck is an index into builtins_ array of the isolate:
(lldb) job *isolate->builtins()->builtin_handle(Builtins::Name::kStackCheck)
0x2a9270441da1: [Code]
kind = BUILTIN
name = StackCheck
compiler = unknown
Instructions (size = 17)
0x2a9270441e00 0 33c0 xorl rax,rax
0x2a9270441e02 2 48bb70de420101000000 REX.W movq rbx,0x10142de70 ;; external reference (Runtime::StackGuard)
0x2a9270441e0c c e92f29f4ff jmp 0x2a9270384740 ;; code: STUB, CEntryStub, minor: 8
RelocInfo (size = 3)
0x2a9270441e04 external reference (Runtime::StackGuard) (0x10142de70)
0x2a9270441e0d code target (STUB) (0x2a9270384740)
If you look closely the first instruction is just setting rax to zero using xor. Next, we are pushing the pointer to the function, in this case Runtime::StackGuard into rbx. We then jump to CEntryStub.
What is StackGuard
?
Well, it is defined in a macro in src/runtime/runtime.h:
...
F(StackGuard, 0, 1)
...
#define FOR_EACH_INTRINSIC(F) \
FOR_EACH_INTRINSIC_RETURN_PAIR(F) \
FOR_EACH_INTRINSIC_RETURN_OBJECT(F)
#define F(name, nargs, ressize) \
Object* Runtime_##name(int args_length, Object** args_object, \
Isolate* isolate);
FOR_EACH_INTRINSIC_RETURN_OBJECT(F)
#undef F
StackGuard
is included in FOR_EACH_INTRINSIC_INTERNAL
which is included by FOR_EACH_INTRINSIC_RETURN_OBJECT
.
So that should expand to:
Object* Runtime_StackGuard(int args_lentgh, Object** args_object, Isolate* isolate);
We can verify this using:
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))
(const Function *) $1058 = 0x00000001029f0340
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->name
(const char *const) $1059 = 0x0000000101c8be27 "StackGuard"
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->nargs
(int8_t) $1060 = '\0'
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->intrinsic_type
(const IntrinsicType) $1061 = RUNTIME
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->entry
(v8::internal::Address) $1063 = 0x000000010142de70 "UH\x89�H��P�}�H�u�H�U�H�}���\x14\x04��\x01H\x83�
If nargs
is -1
then the funnction takes a variable number of arguments.
We can also disassemble the address using:
(lldb) dis -s 0x000000010142de70
node`v8::internal::Runtime_StackGuard:
0x10142de70 <+0>: pushq %rbp
0x10142de71 <+1>: movq %rsp, %rbp
0x10142de74 <+4>: subq $0x50, %rsp
0x10142de78 <+8>: movl %edi, -0xc(%rbp)
0x10142de7b <+11>: movq %rsi, -0x18(%rbp)
0x10142de7f <+15>: movq %rdx, -0x20(%rbp)
0x10142de83 <+19>: movq -0x20(%rbp), %rdi
0x10142de87 <+23>: callq 0x10046f360 ; v8::internal::Isolate::context at isolate.h:596
0x10142de8c <+28>: movb $0x1, %cl
Now that we know this we can disassemble the complete function using:
(lldb) dis -n v8::internal::Runtime_StackGuard
As well as any other runtime function we might be interested in later.
Next, the body will be visited (BytecodeGenerator::VisitStatements():
void BytecodeGenerator::VisitStatements(ZoneList<Statement*>* statements) {
for (int i = 0; i < statements->length(); i++) {
// Allocate an outer register allocations scope for the statement.
RegisterAllocationScope allocation_scope(this);
Statement* stmt = statements->at(i);
Visit(stmt);
if (stmt->IsJump()) break;
}
}
This this case we have three statements:
(lldb) p statements->length()
(int) $99 = 3
(lldb) p stmt->Print()
EXPRESSION STATEMENT at 284
. LITERAL "use strict"
So the is one statement for 'use strict';
The next statement is:
(lldb) p stmt->Print()
EXPRESSION STATEMENT at 299
. ASSIGN at -1
. . VAR PROXY local[0] (0x104892d38) (mode = TEMPORARY) ".result"
. . FUNC LITERAL at 300
. . . NAME
. . . INFERRED NAME
. . . PARAMS
. . . . VAR (0x104880748) (mode = VAR) "process"
This matches the function literal:
(function(process) {
});
This will call BytecodeGenerator::VisitFunctionLiteral
(src/interpreter/bytecode-generator.cc):
void BytecodeGenerator::VisitFunctionLiteral(FunctionLiteral* expr) {
DCHECK_EQ(expr->scope()->outer_scope(), current_scope());
uint8_t flags = CreateClosureFlags::Encode(
expr->pretenure(), closure_scope()->is_function_scope());
size_t entry = builder()->AllocateDeferredConstantPoolEntry();
int slot_index = feedback_index(expr->LiteralFeedbackSlot());
builder()->CreateClosure(entry, slot_index, flags);
function_literals_.push_back(std::make_pair(expr, entry));
}
Now, my main interest is builder()->CreateClosure
which will delegate to BytecodeArrayBuilder::CreateClosure
:
OutputCreateClosure(shared_function_info_entry, slot, flags);
This will output a BytecodeNode that looks like this:
(v8::internal::interpreter::BytecodeNode) $875 = {
bytecode_ = kCreateClosure
operands_ = ([0] = 32767, [1] = 228, [2] = 0, [3] = 0, [4] = 0)
operand_count_ = 3
operand_scale_ = kSingle
source_info_ = (position_type_ = kStatement, source_position_ = 299)
}
The third and last statement is:
(lldb) p stmt->Print()
RETURN at -1
. VAR PROXY local[0] (0x104892d38) (mode = TEMPORARY) ".result"
I'm guessing that this is the return of the function literal.
After this the job will be returned and we have completed the outer function job. This will land us back in GenerateUnoptimizedCode
:
(compiler.cc:415
for (auto it : inner_literals) {
FunctionLiteral* inner_literal = it->value();
std::unique_ptr<CompilationJob> inner_job(
PrepareAndExecuteUnoptimizedCompileJob(parse_info, inner_literal,
isolate));
if (!inner_job) return std::unique_ptr<CompilationJob>();
inner_function_jobs->emplace_front(std::move(inner_job));
} `
The first inner_literal is:
(lldb) p inner_literal->Print()
FUNC LITERAL at 300
. NAME
. INFERRED NAME
. PARAMS
. . VAR (0x10701cd48) (mode = VAR) "process"
This matches function(process) {}
in bootstrap_node.js. The AST will look like this:
[generating bytecode for function: ]
--- AST ---
FUNC at 0
. KIND 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. EXPRESSION STATEMENT at 284
. . LITERAL "use strict"
. EXPRESSION STATEMENT at 299
. . ASSIGN at -1
. . . VAR PROXY local[0] (0x10702f338) (mode = TEMPORARY) ".result"
. . . FUNC LITERAL at 300
. . . . NAME
. . . . INFERRED NAME
. . . . PARAMS
. . . . . VAR (0x10701cd48) (mode = VAR) "process"
. RETURN at -1
. . VAR PROXY local[0] (0x10702f338) (mode = TEMPORARY) ".result"
This function will require a context (in BytecodeGenerator::GenerateBytecode):
if (closure_scope()->NeedsContext()) {
BuildNewLocalActivationContext();
ContextScope local_function_context(this, closure_scope());
BuildLocalActivationContextInitialization();
GenerateBytecodeBody();
}
BytecodeGenerator::BuildNewLocalActivationContext
:
builder()->CreateFunctionContext(slot_count);
This will delegate to OutputCreateFunctionContext(slots)
(lldb) p *node
(v8::internal::interpreter::BytecodeNode) $933 = {
bytecode_ = kCreateFunctionContext
operands_ = ([0] = 21, [1] = 0, [2] = 0, [3] = 0, [4] = 0)
operand_count_ = 1
operand_scale_ = kSingle
source_info_ = (position_type_ = kNone, source_position_ = -1)
}
BytecodeGenerator::BuildLocalActivationContextInitialization
:
The OutputCreateFunctionContext(slots) will add bytecodes for:
(lldb) p *node
(v8::internal::interpreter::BytecodeNode) $173 = {
bytecode_ = kCreateFunctionContext
operands_ = ([0] = 21, [1] = 0, [2] = 0, [3] = 0, [4] = 0)
operand_count_ = 1
operand_scale_ = kSingle
source_info_ = (position_type_ = kNone, source_position_ = -1)
BuildLocalActivationContextInitalization()
will set up the parameters:
(lldb) p num_parameters
(int) $186 = 1
(lldb) expr *variable->name_
(const v8::internal::AstRawString) $187 = {
= {
next_ = 0x0000000104857060
string_ = 0x0000000104857060
}
literal_bytes_ = (start_ = "process", length_ = 7)
hash_field_ = 2818393206
is_one_byte_ = true
has_string_ = true
}
So we can see that this is the process parameter.
builder()->LoadAccumulatorWithRegister(parameter).StoreContextSlot(
execution_context()->reg(), variable->index(), 0);
VisitVariableDeclaration: (lldb) p *variable->name_ (const v8::internal::AstRawString) $247 = { = { next_ = 0x0000000104857068 string_ = 0x0000000104857068 } literal_bytes_ = (start_ = "internalBinding", length_ = 15) hash_field_ = 2406209186 is_one_byte_ = true has_string_ = true } Next vars are:
#exceptionHandlerState
#startup
#setupProcessObject
...
builder()->StackCheck(info()->literal()->start_position()); VisitStatements(info()->literal()->body());
(lldb) expr stmt->Print()
BLOCK NOCOMPLETIONS at -1
. EXPRESSION STATEMENT at 326
. . INIT at 326
. . . VAR PROXY context[5] (0x1048808a0) (mode = LET) "internalBinding"
. . . LITERAL undefined
After all the AST nodes have been visisted and compiled into bytecode we will be back in GenerateUnoptimizedCode
where we were compiling all the inner statments of the outer code. The outer_function_job is then returned to CompileToLevel
:
(src/compiler.cc:793)
parse_info->ast_value_factory()->Internalize(isolate);
EnsureSharedFunctionInfosArrayOnScript(parse_info, isolate);
Handle<SharedFunctionInfo> shared_info =
isolate->factory()->NewSharedFunctionInfoForLiteral(parse_info->literal(),
parse_info->script());
Handle<SharedFunctionInfo> Factory::NewSharedFunctionInfoForLiteral
:
(factory.cc)
Handle<Code> code = BUILTIN_CODE(isolate(), CompileLazy);
Handle<ScopeInfo> scope_info(ScopeInfo::Empty(isolate()));
Handle<SharedFunctionInfo> result = NewSharedFunctionInfo(literal->name(), literal->kind(), code, scope_info);
So notice that the code for this function literal will be CompileLazy
CompileLazy
Can be found in src/builtins/builtins-definitions.h
:
ASM(CompileLazy)
And the code that generates this builtin can be found in src/builtins/x64/builtins-x64.cc
. The builtins are documented above but
just remember that they are generated at compile time and then serialized into the snapshot and deserialized upon startup.
opcode mapping:
// Register closure = rdi;
// Register feedback_vector = rbx;
0xa7cbb3c50e1: [Code]
kind = BUILTIN
name = CompileLazy
compiler = unknown
Instructions (size = 983)
0xa7cbb3c5140 0 488b5f2f REX.W movq rbx,[rdi+0x2f] // __ movp(feedback_vector, FieldOperand(closure, JSFunction::kFeedbackVectorOffset));
0xa7cbb3c5144 4 488b5b07 REX.W movq rbx,[rbx+0x7] // __ movp(feedback_vector, FieldOperand(feedback_vector, Cell::kValueOffset));
0xa7cbb3c5148 8 493b5da0 REX.W cmpq rbx,[r13-0x60] // __ JumpIfRoot(feedback_vector, Heap::kUndefinedValueRootIndex, &gotta_call_runtime); (src/x86/macro-assembler.h) CompareRoot(with, index);
0xa7cbb3c514c c 0f844c030000 jz 0xa7cbb3c549e (CompileLazy) // __ JumpIfRoot(feedback_vector, Heap::kUndefinedValueRootIndex, &gotto_call_runtime); j(equal, if_equal, if_equal_distance)
// MaybeTailCallOptimizedCodeSlot(masm, feedback_vector, rcx, r14, r15);
// static void MaybeTailCallOptimizedCodeSlot(MacroAssembler* masm, Register feedback_vector, Register scratch1, Register scratch2, Register scratch3)
// Register closure = rdi;
// Register optimized_code_entry = scratch1; // rcx
0xa7cbb3c5152 12 488b4b0f REX.W movq rcx,[rbx+0xf] // __ movp(optimized_code_entry, FieldOperand(feedback_vector, FeedbackVector::kOptimizedCodeOffset));
0xa7cbb3c5156 16 f6c101 testb rcx,0x1 // __ JumpIfNotSmi(optimized_code_entry, &optimized_code_slot_is_cell): CheckSMI
0xa7cbb3c5159 19 0f8591010000 jnz 0xa7cbb3c52f0 (CompileLazy // __ JumpIfNotSmi(optimized_code_entry, &optimized_code_slot_is_cell): j(NegateCondition(smi), on_not_smi, near_jump);
// TailCallRuntimeIfMarkerEquals:730
0xa7cbb3c515f 1f f6c101 testb rcx,0x1 // __ SmiCompare(smi_entry, Smi::FromEnum(marker));
0xa7cbb3c5162 22 7410 jz 0xa7cbb3c5174 (CompileLazy) // __ j(not_equal, &no_match, Label::kNear);
0xa7cbb3c5164 24 48ba000000003d000000 REX.W movq rdx,0x3d00000000 //
0xa7cbb3c5192 52 49ba0000000001000000 REX.W movq r10,0x100000000 // __ movp(entry, FieldOperand(closure, JSFunction::kSharedFunctionInfoOffset)); entry =
`NewSharedFunctionInfo` will delegate to
Handle<SharedFunctionInfo> shared = NewSharedFunctionInfo(name, code, IsConstructable(kind), kind);
Is is that the inner functions will have a code of CompileLazy and the outer InterpreterEntryTrampoline?
I we assume that InterpreterEntryTrampoline
```console
SharedFunctionInfo: 0x188c4f2adf69 <SharedFunctionInfo>
Optimized Code: 0
Invocation Count: 0
Profiler Ticks: 0
Slot #0 kCreateClosure
[0]: 0x188c4f2b0a81 <Cell value= 0x188cc2d822e1 <undefined>>
There is no optimized code for this yet. This function has not been called before.
### void BytecodeGenerator::GenerateBytecode(uintptr_t stack_limit)
For an interpreted function interpreter/bytecode-generator.h ExecuteJobImpl will be called by
`src/interpreter/bytecode-generator.cc` is what actual generated the bytecode.
Ast for the outer most function in bootstrap_node.js:
```console
FUNC at 0
. KIND 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. EXPRESSION STATEMENT at 284
. . LITERAL "use strict"
. EXPRESSION STATEMENT at 299
. . ASSIGN at -1
. . . VAR PROXY local[0] (0x104892d38) (mode = TEMPORARY) ".result"
. . . FUNC LITERAL at 300
. . . . NAME
. . . . INFERRED NAME
. . . . PARAMS
. . . . . VAR (0x104880748) (mode = VAR) "process"
. RETURN at -1
. . VAR PROXY local[0] (0x104892d38) (mode = TEMPORARY) ".result"
(lldb) job v8::internal::JSFunction::cast(func)->abstract_code() 0x1ca547f2e031: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
0 E> 0x1ca547f2e06a @ 0 : 93 StackCheck
299 S> 0x1ca547f2e06b @ 1 : 6f 00 00 00 CreateClosure [0], [0], #0
0x1ca547f2e06f @ 5 : 1e fb Star r0
21946 S> 0x1ca547f2e071 @ 7 : 97 Return
Constant pool (size = 1)
0x1ca547f2e019: [FixedArray] in OldSpace
- map = 0x1ca516b822f1 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x1ca547f2df69 <SharedFunctionInfo>
Above we see the generated bytecode for func. This was generated by parsing the javascipt code into the AST.
(lldb) job v8::internal::JSFunction::cast(func)->code()
0x3a9296944281: [Code]
kind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x3a92969442e0 0 488b5f2f REX.W movq rbx,[rdi+0x2f]
0x3a92969442e4 4 488b5b07 REX.W movq rbx,[rbx+0x7]
0x3a92969442e8 8 488b4b0f REX.W movq rcx,[rbx+0xf]
0x3a92969442ec c f6c101 testb rcx,0x1
0x3a92969442ef f 0f8591010000 jnz 0x3a9296944486 (InterpreterEntryTrampoline)
0x3a92969442f5 15 f6c101 testb rcx,0x1
0x3a92969442f8 18 7410 jz 0x3a929694430a (InterpreterEntryTrampoline)
0x3a92969442fa 1a 48ba000000003d000000 REX.W movq rdx,0x3d00000000
0x3a9296944304 24 e8b7160200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a9296944309 29 cc int3l
0x3a929694430a 2a 4885c9 REX.W testq rcx,rcx
0x3a929694430d 2d 0f8486020000 jz 0x3a9296944599 (InterpreterEntryTrampoline)
0x3a9296944313 33 f6c101 testb rcx,0x1
0x3a9296944316 36 7410 jz 0x3a9296944328 (InterpreterEntryTrampoline)
0x3a9296944318 38 48ba000000003d000000 REX.W movq rdx,0x3d00000000
0x3a9296944322 42 e899160200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a9296944327 47 cc int3l
0x3a9296944328 48 49ba0000000001000000 REX.W movq r10,0x100000000
0x3a9296944332 52 493bca REX.W cmpq rcx,r10
0x3a9296944335 55 7579 jnz 0x3a92969443b0 (InterpreterEntryTrampoline)
0x3a9296944337 57 55 push rbp
0x3a9296944338 58 4889e5 REX.W movq rbp,rsp
0x3a929694433b 5b 6a1c push 0x1c
0x3a929694433d 5d 49ba81429496923a0000 REX.W movq r10,0x3a9296944281 (InterpreterEntryTrampoline) ;; object: 0x3a9296944281 <Code BUILTIN>
0x3a9296944347 67 4152 push r10
0x3a9296944349 69 49bae122b8d7a51c0000 REX.W movq r10,0x1ca5d7b822e1 ;; object: 0x1ca5d7b822e1 <undefined>
0x3a9296944353 73 4c391424 REX.W cmpq [rsp],r10
0x3a9296944357 77 7510 jnz 0x3a9296944369 (InterpreterEntryTrampoline)
0x3a9296944359 79 48ba0000000009000000 REX.W movq rdx,0x900000000
0x3a9296944363 83 e858160200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a9296944368 88 cc int3l
0x3a9296944369 89 48c1e020 REX.W shlq rax, 32
0x3a929694436d 8d 50 push rax
0x3a929694436e 8e 57 push rdi
0x3a929694436f 8f 52 push rdx
0x3a9296944370 90 57 push rdi
0x3a9296944371 91 b801000000 movl rax,0x1
0x3a9296944376 96 48bb90453e0101000000 REX.W movq rbx,0x1013e4590 ;; external reference (Runtime::CompileOptimized_NotConcurrent)
0x3a9296944380 a0 e8bb03f4ff call 0x3a9296884740 ;; code: STUB, CEntryStub, minor: 8
0x3a9296944385 a5 488bd8 REX.W movq rbx,rax
0x3a9296944388 a8 5a pop rdx
0x3a9296944389 a9 5f pop rdi
0x3a929694438a aa 58 pop rax
0x3a929694438b ab 48c1e820 REX.W shrq rax, 32
0x3a929694438f af 48837df81c REX.W cmpq [rbp-0x8],0x1c
0x3a9296944394 b4 7410 jz 0x3a92969443a6 (InterpreterEntryTrampoline)
0x3a9296944396 b6 48ba000000004f000000 REX.W movq rdx,0x4f00000000
0x3a92969443a0 c0 e81b160200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a92969443a5 c5 cc int3l
0x3a92969443a6 c6 488be5 REX.W movq rsp,rbp
0x3a92969443a9 c9 5d pop rbp
0x3a92969443aa ca 488d5b5f REX.W leaq rbx,[rbx+0x5f]
0x3a92969443ae ce ffe3 jmp rbx
0x3a92969443b0 d0 f6c101 testb rcx,0x1
0x3a92969443b3 d3 7410 jz 0x3a92969443c5 (InterpreterEntryTrampoline)
0x3a92969443b5 d5 48ba000000003d000000 REX.W movq rdx,0x3d00000000
0x3a92969443bf df e8fc150200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a92969443c4 e4 cc int3l
0x3a92969443c5 e5 49ba0000000002000000 REX.W movq r10,0x200000000
0x3a92969443cf ef 493bca REX.W cmpq rcx,r10
0x3a92969443d2 f2 7579 jnz 0x3a929694444d (InterpreterEntryTrampoline)
0x3a92969443d4 f4 55 push rbp
0x3a92969443d5 f5 4889e5 REX.W movq rbp,rsp
0x3a92969443d8 f8 6a1c push 0x1c
0x3a92969443da fa 49ba81429496923a0000 REX.W movq r10,0x3a9296944281 (InterpreterEntryTrampoline) ;; object: 0x3a9296944281 <Code BUILTIN>
0x3a92969443e4 104 4152 push r10
0x3a92969443e6 106 49bae122b8d7a51c0000 REX.W movq r10,0x1ca5d7b822e1 ;; object: 0x1ca5d7b822e1 <undefined>
0x3a92969443f0 110 4c391424 REX.W cmpq [rsp],r10
0x3a92969443f4 114 7510 jnz 0x3a9296944406 (InterpreterEntryTrampoline)
0x3a92969443f6 116 48ba0000000009000000 REX.W movq rdx,0x900000000
0x3a9296944400 120 e8bb150200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a9296944405 125 cc int3l
0x3a9296944406 126 48c1e020 REX.W shlq rax, 32
0x3a929694440a 12a 50 push rax
0x3a929694440b 12b 57 push rdi
0x3a929694440c 12c 52 push rdx
0x3a929694440d 12d 57 push rdi
0x3a929694440e 12e b801000000 movl rax,0x1
0x3a9296944413 133 48bb00403e0101000000 REX.W movq rbx,0x1013e4000 ;; external reference (Runtime::CompileOptimized_Concurrent)
0x3a929694441d 13d e81e03f4ff call 0x3a9296884740 ;; code: STUB, CEntryStub, minor: 8
0x3a9296944422 142 488bd8 REX.W movq rbx,rax
0x3a9296944425 145 5a pop rdx
0x3a9296944426 146 5f pop rdi
0x3a9296944427 147 58 pop rax
0x3a9296944428 148 48c1e820 REX.W shrq rax, 32
0x3a929694442c 14c 48837df81c REX.W cmpq [rbp-0x8],0x1c
0x3a9296944431 151 7410 jz 0x3a9296944443 (InterpreterEntryTrampoline)
0x3a9296944433 153 48ba000000004f000000 REX.W movq rdx,0x4f00000000
0x3a929694443d 15d e87e150200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN
0x3a9296944442 162 cc int3l
0x3a9296944443 163 488be5 REX.W movq rsp,rbp
0x3a9296944446 166 5d pop rbp
0x3a9296944447 167 488d5b5f REX.W leaq rbx,[rbx+0x5f]
0x3a92969445d4 2f4 48ba000000001e000000 REX.W movq rdx,0x1e00000000 // __ Assert(equal, kFunctionDataShouldBeBytecodeArrayOnInterpreterEntry); -> Check(cc, reason); -> Abort -> Move(rdx, Smi::FromInt(static_cast<int>(reason)));
0x3a92969445de 2fe e8dd130200 call 0x3a92969659c0 (Abort) ;; code: BUILTIN // __ Assert(equal, kFunctionDataShouldBeBytecodeArrayOnInterpreterEntry); -> Check(cc, reason); -> Abort -> Call(BUILTIN_CODE(isolate(), Abort), RelocInfo::CODE_TARGET);
0x3a92969445e3 303 cc int3l // __ Assert(equal, kFunctionDataShouldBeBytecodeArrayOnInterpreterEntry); -> Check(cc, reason); -> Abort -> int3 "instruction trap 3" is intended for calling the debug exception handler
0x3a92969445e4 304 41c6463800 movb [r14+0x38],0x0 // __ movb(FieldOperand(kInterpreterBytecodeArrayRegister, BytecodeArray::kBytecodeAgeOffset), Immediate(BytecodeArray::kNoAgeBytecodeAge));
0x3a92969445e9 309 49c7c439000000 REX.W movq r12,0x39 // __ movp(kInterpreterBytecodeOffsetRegister, Immediate(BytecodeArray::kHeaderSize - kHeapObjectTag));
0x3a92969445f0 310 4156 push r14 // __ Push(kInterpreterBytecodeArrayRegister);
0x3a92969445f2 312 4489e1 movl rcx,r12 // __ Integer32ToSmi(rcx, kInterpreterBytecodeOffsetRegister); -> movl(dst, src);
0x3a92969445f5 315 48c1e120 REX.W shlq rcx, 32 // __ Integer32ToSmi(rcx, kInterpreterBytecodeOffsetRegister); -> shlp(dst, Immediate(kSmiShift));
0x3a92969445f9 319 51 push rcx // __ Push(rcx);
0x3a92969445fa 31a 418b4e27 movl rcx,[r14+0x27] // __ movl(rcx, FieldOperand(kInterpreterBytecodeArrayRegister, BytecodeArray::kFrameSizeOffset));
0x3a92969445fe 31e 4889e0 REX.W movq rax,rsp // __ movp(rax, rsp);
0x3a9296944601 321 482bc1 REX.W subq rax,rcx // __ subp(rax, rcx);
0x3a9296944604 324 493b85080d0000 REX.W cmpq rax,[r13+0xd08] // __ CompareRoot(rax, Heap::kRealStackLimitRootIndex);
0x3a929694460b 32b 7311 jnc 0x3a929694461e (InterpreterEntryTrampoline) // __ j(above_equal, &ok, Label::kNear);
0x3a929694460d 32d 33c0 xorl rax,rax // __ CallRuntime(Runtime::kThrowStackOverflow); -> Set(rax, num_arguments);
0x3a929694460f 32f 48bb306c420101000000 REX.W movq rbx,0x101426c30 // __ CallRuntime(Runtime::kThrowStackOverflow); -> Set(rax, num_arguments) -> LoadAddress(rbx, ExternalReference(f, isolate()));
0x3a9296944619 339 e82201f4ff call 0x3a9296884740 ;; code: STUB, CEntryStub // __ CallRuntime(Runtime::kThrowStackOverflow); -> Set(rax, num_arguments) -> LoadAddress(rbx, ExternalReference(f, isolate())) -> CEntryStub ces(isolate(), f->result_size, save_doubles); CallStub(&ces);
0x3a929694461e 33e 498b45a0 REX.W movq rax,[r13-0x60] // __ LoadRoot(rax, Heap::kUndefinedValueRootIndex);
0x3a9296944622 342 e901000000 jmp 0x3a9296944628 (InterpreterEntryTrampoline) // __ j(always, &loop_check);
0x3a9296944627 347 50 push rax // __ Push(rax);
0x3a9296944628 348 4883e908 REX.W subq rcx,0x8 // __ subp(rcx, Immediate(kPointerSize));
0x3a929694462c 34c 7df9 jge 0x3a9296944627 (InterpreterEntryTrampoline) // __ j(greater_equal, &loop_header, Label::kNear);
0x3a929694462e 34e 4963462f REX.W movsxlq rax,[r14+0x2f] // __ movsxlq(rax, FieldOperand(kInterpreterBytecodeArrayRegister, BytecodeArray::kIncomingNewTargetOrGeneratorRegisterOffset));
0x3a9296944632 352 85c0 testl rax,rax // __ testl(rax, rax);
0x3a9296944634 354 7405 jz 0x3a929694463b (InterpreterEntryTrampoline) // __ j(zero, &no_incoming_new_target_or_generator_register, Label::kNear);
0x3a9296944636 356 488954c500 REX.W movq [rbp+rax*8+0x0],rdx // __ movp(Operand(rbp, rax, times_pointer_size, 0), rdx);
0x3a929694463b 35b 498b45a0 REX.W movq rax,[r13-0x60] // __ LoadRoot(kInterpreterAccumulatorRegister, Heap::kUndefinedValueRootIndex);
0x3a929694463f 35f 49bf1062040601000000 REX.W movq r15,0x106046210 // __ Move(kInterpreterDispatchTableRegister, ExternalReference::interpreter_dispatch_table_address(masm->isolate()));
0x3a9296944649 369 430fb61c26 movzxbl rbx,[r14+r12*1] // __ movzxbp(rbx, Operand(kInterpreterBytecodeArrayRegister, kInterpreterBytecodeOffsetRegister, times_1, 0));
0x3a929694464e 36e 498b1cdf REX.W movq rbx,[r15+rbx*8] // __ movp(rbx, Operand(kInterpreterDispatchTableRegister, rbx, times_pointer_size, 0));
0x3a9296944652 372 ffd3 call rbx // __ call(rbx); this dispatched to the bytecode handler. What is the bytecode entry handler in this case? Is it CreateClosure?
// Yes, this call will eventually end up in Runtime_InterpreterNewClosure
0x3a9296944654 374 4c8b75e8 REX.W movq r14,[rbp-0x18]
0x3a9296944658 378 4c8b65e0 REX.W movq r12,[rbp-0x20]
0x3a929694465c 37c 49c1ec20 REX.W shrq r12, 32
0x3a9296944660 380 430fb61c26 movzxbl rbx,[r14+r12*1]
0x3a9296944665 385 80fb97 cmpb bl,0x97
0x3a9296944668 388 7439 jz 0x3a92969446a3 (InterpreterEntryTrampoline)
0x3a929694466a 38a 48b9a022db0101000000 REX.W movq rcx,0x101db22a0 ;; external reference (Bytecodes::bytecode_size_table_address)
0x3a9296944674 394 80fb01 cmpb bl,0x1
0x3a9296944677 397 7724 ja 0x3a929694469d (InterpreterEntryTrampoline)
0x3a9296944679 399 7411 jz 0x3a929694468c (InterpreterEntryTrampoline)
0x3a929694467b 39b 41ffc4 incl r12
0x3a929694467e 39e 430fb61c26 movzxbl rbx,[r14+r12*1]
0x3a9296944683 3a3 4881c1ac020000 REX.W addq rcx,0x2ac
0x3a929694468a 3aa eb11 jmp 0x3a929694469d (InterpreterEntryTrampoline)
0x3a929694468c 3ac 41ffc4 incl r12
0x3a929694468f 3af 430fb61c26 movzxbl rbx,[r14+r12*1]
0x3a9296944694 3b4 4881c158050000 REX.W addq rcx,0x558
0x3a929694469b 3bb eb00 jmp 0x3a929694469d (InterpreterEntryTrampoline)
0x3a929694469d 3bd 44032499 addl r12,[rcx+rbx*4]
0x3a92969446a1 3c1 eb9c jmp 0x3a929694463f (InterpreterEntryTrampoline)
0x3a92969446a3 3c3 488b5de8 REX.W movq rbx,[rbp-0x18]
0x3a92969446a7 3c7 8b5b2b movl rbx,[rbx+0x2b]
0x3a92969446aa 3ca c9 leavel
0x3a92969446ab 3cb 59 pop rcx
0x3a92969446ac 3cc 4803e3 REX.W addq rsp,rbx
0x3a92969446af 3cf 51 push rcx
0x3a92969446b0 3d0 c3 retl
0x3a92969446b1 3d1 488b4847 REX.W movq rcx,[rax+0x47]
0x3a92969446b5 3d5 448b512b movl r10,[rcx+0x2b]
0x3a92969446b9 3d9 41f6c201 testb r10,0x1
0x3a92969446bd 3dd 0f84eefeffff jz 0x3a92969445b1 (InterpreterEntryTrampoline)
0x3a92969446c3 3e3 4c8b7117 REX.W movq r14,[rcx+0x17]
0x3a92969446c7 3e7 e9e5feffff jmp 0x3a92969445b1 (InterpreterEntryTrampoline)
RelocInfo (size = 39)
0x3a9296944305 code target (BUILTIN) (0x3a92969659c0)
0x3a9296944323 code target (BUILTIN) (0x3a92969659c0)
0x3a929694433f embedded object (0x3a9296944281 <Code BUILTIN>)
0x3a929694434b embedded object (0x1ca5d7b822e1 <undefined>)
0x3a9296944364 code target (BUILTIN) (0x3a92969659c0)
0x3a9296944378 external reference (Runtime::CompileOptimized_NotConcurrent) (0x1013e4590)
0x3a9296944381 code target (STUB) (0x3a9296884740)
0x3a92969443a1 code target (BUILTIN) (0x3a92969659c0)
0x3a92969443c0 code target (BUILTIN) (0x3a92969659c0)
0x3a92969443dc embedded object (0x3a9296944281 <Code BUILTIN>)
0x3a92969443e8 embedded object (0x1ca5d7b822e1 <undefined>)
0x3a9296944401 code target (BUILTIN) (0x3a92969659c0)
0x3a9296944415 external reference (Runtime::CompileOptimized_Concurrent) (0x1013e4000)
0x3a929694441e code target (STUB) (0x3a9296884740)
0x3a929694443e code target (BUILTIN) (0x3a92969659c0)
0x3a929694445d code target (BUILTIN) (0x3a92969659c0)
0x3a929694447c code target (BUILTIN) (0x3a92969659c0)
0x3a92969444c3 code target (BUILTIN) (0x3a92969659c0)
0x3a92969444ee code target (STUB) (0x3a92968a6480)
0x3a9296944528 embedded object (0x3a9296944281 <Code BUILTIN>)
0x3a9296944534 embedded object (0x1ca5d7b822e1 <undefined>)
0x3a929694454d code target (BUILTIN) (0x3a92969659c0)
0x3a9296944561 external reference (Runtime::EvictOptimizedCodeSlot) (0x1013e4b20)
0x3a929694456a code target (STUB) (0x3a9296884740)
0x3a929694458a code target (BUILTIN) (0x3a92969659c0)
0x3a92969445c5 code target (BUILTIN) (0x3a92969659c0)
0x3a92969445df code target (BUILTIN) (0x3a92969659c0)
0x3a9296944611 external reference (Runtime::ThrowStackOverflow) (0x101426c30)
0x3a929694461a code target (STUB) (0x3a9296884740)
0x3a9296944641 external reference (Interpreter::dispatch_table_address) (0x106046210)
0x3a929694466c external reference (Bytecodes::bytecode_size_table_address) (0x101db22a0)
Notice that the name InterpreterEntryTrampoline
.
Heap objects
How are js_entry_code() set? For this we have to look in src/heap/heap.h:
V(Code, js_entry_code, JsEntryCode)
and in src/factory-inl.h
we have:
#define ROOT_ACCESSOR(type, name, camel_name) \
Handle<type> Factory::name() { \
return Handle<type>(bit_cast<type**>( \
&isolate()->heap()->roots_[Heap::k##camel_name##RootIndex])); \
}
ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR
So this will expand to:
Handle<Code> Factory::js_entry_code() {
return Handle<Code>(bit_cast<type**>(&isolate()->heap()->roots_[Heap::kJsEntryCodeRootIndex])); \
We can verify what is returned is we have access to an isolate using:
(lldb) expr isolate->heap()->roots_[v8::internal::Heap::RootListIndex::kJsEntryCodeRootIndex]
(v8::internal::Object *) $1069 = 0x0000165740a04001
And we can print the code using:
(lldb) job isolate->heap()->roots_[v8::internal::Heap::RootListIndex::kJsEntryCodeRootIndex]
This is sort of interesting that we can access any or the root objects using the above method. For example, if
we look in heap.h and the STRONG_ROOT_LIST
we should be able to inspect any. For example, we could look at the
TrueValue
:
(lldb) job isolate->heap()->roots_[v8::internal::Heap::RootListIndex::kTrueValueRootIndex]
#true
$ size -x -l -m out/Debug/node Segment __PAGEZERO: 0x100000000 (vmaddr 0x0 fileoff 0) Segment __TEXT: 0x299f000 (vmaddr 0x100000000 fileoff 0) Section __text: 0x1c1feca (addr 0x100000d00 offset 3328) Section __stubs: 0x14ee (addr 0x101c20bca offset 29494218) Section __stub_helper: 0xe84 (addr 0x101c220b8 offset 29499576) Section __const: 0x821f20 (addr 0x101c23000 offset 29503488) Section __cstring: 0x147a18 (addr 0x102444f20 offset 38031136) Section __ustring: 0x942 (addr 0x10258c938 offset 39373112) Section __dof_node: 0x89e (addr 0x10258d27a offset 39375482) Section __unwind_info: 0x6ef8 (addr 0x10258db18 offset 39377688) Section __eh_frame: 0x409f10 (addr 0x102594a10 offset 39406096) total 0x299db5c Segment __DATA: 0xbf000 (vmaddr 0x10299f000 fileoff 43642880) Section __program_vars: 0x28 (addr 0x10299f000 offset 43642880) Section __nl_symbol_ptr: 0x10 (addr 0x10299f028 offset 43642920) Section __got: 0x4350 (addr 0x10299f038 offset 43642936) Section __la_symbol_ptr: 0x1be8 (addr 0x1029a3388 offset 43660168) Section __mod_init_func: 0x50 (addr 0x1029a4f70 offset 43667312) Section __mod_term_func: 0x10 (addr 0x1029a4fc0 offset 43667392) Section __const: 0x70fb0 (addr 0x1029a4fd0 offset 43667408) Section __data: 0x34700 (addr 0x102a15f80 offset 44130176) Section __thread_vars: 0x18 (addr 0x102a4a680 offset 44344960) Section __thread_bss: 0x4 (addr 0x102a4a698 offset 0) Section __common: 0x14d8 (addr 0x102a4a6a0 offset 0) Section __bss: 0x12447 (addr 0x102a4bb80 offset 0) total 0xbefbb Segment __LINKEDIT: 0x1bc7000 (vmaddr 0x102a5e000 fileoff 44347392) total 0x104625000: pushq (%r10)
Context
JavaScript provides a set of builtin functions and objects. These functions and objects can be changed by user code. Each context is separate collection of these objects and functions.
And internal::Context is declared in deps/v8/src/contexts.h
and extends FixedArray
class Context: public FixedArray {
(lldb) br s -f node.cc -l 4439
(lldb) expr context->length()
(int) $522 = 281
This output was taken
Creating a new Context is done by v8::CreateEnvironment
(lldb) br s -f api.cc -l 6565
InvokeBootstrapper<ObjectType> invoke;
6635 result =
-> 6636 invoke.Invoke(isolate, maybe_proxy, proxy_template, extensions,
6637 context_snapshot_index, embedder_fields_deserializer);
This will later end up in Snapshot::NewContextFromSnapshot
:
Vector<const byte> context_data =
ExtractContextData(blob, static_cast<uint32_t>(context_index));
SnapshotData snapshot_data(context_data);
MaybeHandle<Context> maybe_result = PartialDeserializer::DeserializeContext(
isolate, &snapshot_data, can_rehash, global_proxy,
embedder_fields_deserializer);
So we can see here that the Context is deserialized from the snapshot. What does the Context contain at this stage:
(lldb) expr result->length()
(int) $650 = 281
(lldb) expr result->Print()
Lets take a look at an entry:
(lldb) expr result->get(0)->Print()
0xc201584331: [Function] in OldSpace
- map = 0xc24c002251 [FastProperties]
- prototype = 0xc201584371
- elements = 0xc2b2882251 <FixedArray[0]> [HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0xc2b2887521 <SharedFunctionInfo>
- name = 0xc2b2882441 <String[0]: >
- formal_parameter_count = -1
- kind = [ NormalFunction ]
- context = 0xc201583a59 <FixedArray[281]>
- code = 0x2df1f9865a61 <Code BUILTIN>
- source code = () {}
- properties = 0xc2b2882251 <FixedArray[0]> {
#length: 0xc2cca83729 <AccessorInfo> (const accessor descriptor)
#name: 0xc2cca83799 <AccessorInfo> (const accessor descriptor)
#arguments: 0xc201587fd1 <AccessorPair> (const accessor descriptor)
#caller: 0xc201587fd1 <AccessorPair> (const accessor descriptor)
#constructor: 0xc201584c29 <JSFunction Function (sfi = 0xc2b28a6fb1)> (const data descriptor)
#apply: 0xc201588079 <JSFunction apply (sfi = 0xc2b28a7051)> (const data descriptor)
#bind: 0xc2015880b9 <JSFunction bind (sfi = 0xc2b28a70f1)> (const data descriptor)
#call: 0xc2015880f9 <JSFunction call (sfi = 0xc2b28a7191)> (const data descriptor)
#toString: 0xc201588139 <JSFunction toString (sfi = 0xc2b28a7231)> (const data descriptor)
0xc2b28bc669 <Symbol: Symbol.hasInstance>: 0xc201588179 <JSFunction [Symbol.hasInstance] (sfi = 0xc2b28a72d1)> (const data descriptor)
}
- feedback vector: not available
So we can see that this is of type [Function]
which we can cast using:
(lldb) expr JSFunction::cast(result->get(0))->code()->Print()
0x2df1f9865a61: [Code]
kind = BUILTIN
name = EmptyFunction
(lldb) expr result->previous()
(v8::internal::Context *) $657 = 0x0000000000000000
(lldb) expr JSFunction::cast(result->closure())->Print()
0xc201584331: [Function] in OldSpace
- map = 0xc24c002251 [FastProperties]
- prototype = 0xc201584371
- elements = 0xc2b2882251 <FixedArray[0]> [HOLEY_ELEMENTS]
- initial_map =
- shared_info = 0xc2b2887521 <SharedFunctionInfo>
- name = 0xc2b2882441 <String[0]: >
- formal_parameter_count = -1
- kind = [ NormalFunction ]
- context = 0xc201583a59 <FixedArray[281]>
- code = 0x2df1f9865a61 <Code BUILTIN>
- source code = () {}
- properties = 0xc2b2882251 <FixedArray[0]> {
#length: 0xc2cca83729 <AccessorInfo> (const accessor descriptor)
#name: 0xc2cca83799 <AccessorInfo> (const accessor descriptor)
#arguments: 0xc201587fd1 <AccessorPair> (const accessor descriptor)
#caller: 0xc201587fd1 <AccessorPair> (const accessor descriptor)
#constructor: 0xc201584c29 <JSFunction Function (sfi = 0xc2b28a6fb1)> (const data descriptor)
#apply: 0xc201588079 <JSFunction apply (sfi = 0xc2b28a7051)> (const data descriptor)
#bind: 0xc2015880b9 <JSFunction bind (sfi = 0xc2b28a70f1)> (const data descriptor)
#call: 0xc2015880f9 <JSFunction call (sfi = 0xc2b28a7191)> (const data descriptor)
#toString: 0xc201588139 <JSFunction toString (sfi = 0xc2b28a7231)> (const data descriptor)
0xc2b28bc669 <Symbol: Symbol.hasInstance>: 0xc201588179 <JSFunction [Symbol.hasInstance] (sfi = 0xc2b28a72d1)> (const data descriptor)
}
- feedback vector: not available
So this is the JSFunction associated with the deserialized context. Not sure what this is about as looking at the source code it looks like
an empty function. A function can also be set on the context so I'm guessing that this give access to the function of a context once set.
Where is function set, well it is probably deserialized but we can see it be used in deps/v8/src/bootstrapper.cc
:
{
Handle<JSFunction> function = SimpleCreateFunction(isolate, factory->empty_string(), Builtins::kAsyncFunctionAwaitCaught, 2, false);
native_context->set_async_function_await_caught(*function);
}
```console
(lldb) expr isolate()->builtins()->builtin_handle(Builtins::Name::kAsyncFunctionAwaitCaught)->Print()
This context also has extensions:
(lldb) expr result->extension()->Print()
Context::Scope
is a RAII class used to Enter/Exit a context. Lets take a closer look at Enter
:
void Context::Enter() {
i::Handle<i::Context> env = Utils::OpenHandle(this);
i::Isolate* isolate = env->GetIsolate();
ENTER_V8_NO_SCRIPT_NO_EXCEPTION(isolate);
i::HandleScopeImplementer* impl = isolate->handle_scope_implementer();
impl->EnterContext(env);
impl->SaveContext(isolate->context());
isolate->set_context(*env);
}
So the current context is saved and then the this context env
is set as the current on the isolate.
EnterContext
will push the passed-in context (deps/v8/src/api.cc):
void HandleScopeImplementer::EnterContext(Handle<Context> context) {
entered_contexts_.push_back(*context);
}
...
DetachableVector<Context*> entered_contexts_;
DetachableVector is a delegate/adaptor with some additonaly features on a std::vector. Handle<Context> context1 = NewContext(isolate); Handle<Context> context2 = NewContext(isolate); Context::Scope context_scope1(context1); // entered_contexts_ [context1], saved_contexts_[isolateContext] Context::Scope context_scope2(context2); // entered_contexts_ [context1, context2], saved_contexts[isolateContext, context1]
Now, SaveContext
is using the current context, not this
context (env
) and pushing that to the end of the saved_contexts_ vector.
We can look at this as we entered context_scope2 from context_scope1:
And Exit
looks like:
void Context::Exit() {
i::Handle<i::Context> env = Utils::OpenHandle(this);
i::Isolate* isolate = env->GetIsolate();
ENTER_V8_NO_SCRIPT_NO_EXCEPTION(isolate);
i::HandleScopeImplementer* impl = isolate->handle_scope_implementer();
if (!Utils::ApiCheck(impl->LastEnteredContextWas(env),
"v8::Context::Exit()",
"Cannot exit non-entered context")) {
return;
}
impl->LeaveContext();
isolate->set_context(impl->RestoreContext());
}
Now the above context was the internal context, but user programs will use context from deps/v8/include/v8.h
:
(lldb) expr context->Global()
(v8::Local<v8::Object>) $29 = (val_ = 0x000000010584e7e8)
When calling a function, for example:
Local<Value> result = script.ToLocalChecked()->Run();
Every HeapObject has a GetIsolate function that returns the isolate associated with the object. This means that we can get the CurrentContext by using this isolate.
What is a native_context, as returned from context->native_context()?
Lets take a closer look at Isolate::GetCurrentContext
:
v8::Local<v8::Context> Isolate::GetCurrentContext() {
i::Isolate* isolate = reinterpret_cast<i::Isolate*>(this);
i::Context* context = isolate->context();
if (context == NULL) return Local<Context>();
i::Context* native_context = context->native_context();
if (native_context == NULL) return Local<Context>();
return Utils::ToLocal(i::Handle<i::Context>(native_context));
}
Notice that the context retrieved from the isolate is only used to retrieve the native context and return it.
At least the first time these might be the same.
deps/v8/src/contexts-inl.h
has the definition for native_context()
:
Context* Context::native_context() const {
Object* result = get(NATIVE_CONTEXT_INDEX);
DCHECK(IsBootstrappingOrNativeContext(this->GetIsolate(), result));
return reinterpret_cast<Context*>(result);
}
So notice the get(NATIVE_CONTEXT_INDEX). Remember that the internal Context extends FixedArray
so get
above is get
from FixedArray
:
inline Object* get(int index) const;
So a context is bacially an array objects and the indexes are defined in Field
enum:
(lldb) expr context->get(Field::NATIVE_CONTEXT_INDEX)
(v8::internal::Object *) $81 = 0x000028ece3a83a59
(lldb) expr context->get(3)
(v8::internal::Object *) $82 = 0x000028ece3a83a59
(lldb) expr context->Print()
0x28ece3a83a59: [FixedArray] in OldSpace
- map = 0x28ec3aa02bb1 <Map(HOLEY_ELEMENTS)>
- length: 281
0: 0x28ece3a84331 <JSFunction (sfi = 0x28ecf1b87521)>
1: 0
2: 0x28ece3aa3309 <JSObject>
3: 0x28ece3a83a59 <FixedArray[281]>
(lldb) expr context->get(Field::NATIVE_CONTEXT_INDEX)->Print()
0x28ece3a83a59: [FixedArray] in OldSpace
- map = 0x28ec3aa02bb1 <Map(HOLEY_ELEMENTS)>
- length: 281
0: 0x28ece3a84331 <JSFunction (sfi = 0x28ecf1b87521)>
1: 0
2: 0x28ece3aa3309 <JSObject>
3: 0x28ece3a83a59 <FixedArray[281]>
4: 0x28ec0e082239 <JSGlobal Object>
So from the above we can see that context
is an array and index Field::NATIVE_CONTEXT_INDEX
is also a FixedArray.
So how does v8::internal::Context
relate to v8::Context
(deps/v8/include/v8.h
):
Let's take a closer look at this function:
```c++
v8::Local<v8::Context> Isolate::GetCurrentContext() {
i::Isolate* isolate = reinterpret_cast<i::Isolate*>(this);
i::Context* context = isolate->context();
if (context == NULL) return Local<Context>();
i::Context* native_context = context->native_context();
if (native_context == NULL) return Local<Context>();
return Utils::ToLocal(i::Handle<i::Context>(native_context));
}
ToLocal
is defined using a macro in deps/v8/src/api.h
:
#define MAKE_TO_LOCAL(Name, From, To) \
Local<v8::To> Utils::Name(v8::internal::Handle<v8::internal::From> obj) { \
return Convert<v8::internal::From, v8::To>(obj); \
}
MAKE_TO_LOCAL(ToLocal, Context, Context)
So this would be expanded by the preprocessor to this:
Local<v8::Context> Utils::ToLocal(v8::internal::Handle<v8::internal::Context> obj) { \
return Convert<v8::internal::Context, v8::Context>(obj); \
}
// instantiated template impl would be something like this:
static inline Local<Context> Convert(v8::internal::Handle<From> obj) {
DCHECK(obj.is_null() ||
(obj->IsSmi() ||
!obj->IsTheHole(i::HeapObject::cast(*obj)->GetIsolate())));
return Local<Context>(reinterpret_cast<Context*>(obj.location()));
}
So, a v8::internal::Context
can be casted to be of type v8::Context
.
(lldb) expr context->get(Field::NATIVE_CONTEXT_INDEX)->Print()
(lldb) expr context->Global()->Set(key, value)
In node a context can be created using NewContext
in node.cc
.
auto context = Context::New(isolate, nullptr, object_template);
Which will call (deps/v8/src/api.cc):
Local<Context> v8::Context::New(
v8::Isolate* external_isolate, v8::ExtensionConfiguration* extensions,
v8::MaybeLocal<ObjectTemplate> global_template,
v8::MaybeLocal<Value> global_object,
DeserializeInternalFieldsCallback internal_fields_deserializer) {
return NewContext(external_isolate,
extensions,
global_template,
global_object,
0,
internal_fields_deserializer);
}
The declaration for this function can be found in deps/v8/include/v8.h
:
Local<Context> NewContext(
v8::Isolate* external_isolate,
v8::ExtensionConfiguration* extensions,
v8::MaybeLocal<ObjectTemplate> global_template,
v8::MaybeLocal<Value> global_object,
size_t context_snapshot_index,
v8::DeserializeInternalFieldsCallback embedder_fields_deserializer) {
...
i::Handle<i::Context> env = CreateEnvironment<i::Context>(
isolate, extensions, global_template, global_object,
context_snapshot_index, embedder_fields_deserializer);
}
CreateEnvironment
:
// Create the environment.
InvokeBootstrapper<ObjectType> invoke;
result = invoke.Invoke(isolate, maybe_proxy, proxy_template, extensions,
context_snapshot_index, embedder_fields_deserializer);
template <>
struct InvokeBootstrapper<i::Context> {
i::Handle<i::Context> Invoke(
i::Isolate* isolate, i::MaybeHandle<i::JSGlobalProxy> maybe_global_proxy,
v8::Local<v8::ObjectTemplate> global_proxy_template,
v8::ExtensionConfiguration* extensions, size_t context_snapshot_index,
v8::DeserializeInternalFieldsCallback embedder_fields_deserializer) {
return isolate->bootstrapper()->CreateEnvironment(
maybe_global_proxy, global_proxy_template, extensions,
context_snapshot_index, embedder_fields_deserializer);
}
};
'CreateEnvironmentin
boostrapper.cc`:
Genesis genesis(isolate_, maybe_global_proxy, global_proxy_template,
context_snapshot_index, embedder_fields_deserializer,
context_type);
SaveContext saved_context(isolate);
SaveContext::SaveContext
in deps/v8/src/isolate.cc
:
SaveContext::SaveContext(Isolate* isolate)
: isolate_(isolate), prev_(isolate->save_context()) {
if (isolate->context() != nullptr) {
context_ = Handle<Context>(isolate->context());
}
isolate->set_save_context(this);
c_entry_fp_ = isolate->c_entry_fp(isolate->thread_local_top());
}
global_proxy = isolate->factory()->NewUninitializedJSGlobalProxy(instance_size);
Factory::NewUninitializedJSGlobalProxy
in `deps/v8/src/factory.cc:
Handle<JSGlobalProxy> Factory::NewUninitializedJSGlobalProxy(int size) {
// Create an empty shell of a JSGlobalProxy that needs to be reinitialized
// via ReinitializeJSGlobalProxy later.
Handle<Map> map = NewMap(JS_GLOBAL_PROXY_TYPE, size);
// Maintain invariant expected from any JSGlobalProxy.
map->set_is_access_check_needed(true);
map->set_may_have_interesting_symbols(true);
CALL_HEAP_FUNCTION(
isolate(), isolate()->heap()->AllocateJSObjectFromMap(*map, NOT_TENURED),
JSGlobalProxy);
}
(lldb) expr isolate()->heap()->AllocateJSObjectFromMap(*map, static_cast<PretenureFlag>(NOT_TENURED) , nullptr)
(v8::internal::AllocationResult) $10 = (object_ = 0x00002ef850802239)
CALL_HEAP_FUNCTION
macro:
#define RETURN_OBJECT_UNLESS_RETRY(ISOLATE, TYPE) \
if (__allocation__.To(&__object__)) { \
DCHECK(__object__ != (ISOLATE)->heap()->exception()); \
return Handle<TYPE>(TYPE::cast(__object__), ISOLATE); \
}
#define CALL_HEAP_FUNCTION(ISOLATE, FUNCTION_CALL, TYPE)
AllocationResult __allocation__ = FUNCTION_CALL;
Object* __object__ = nullptr;
if (__allocation__.To(&__object__)) { // expanded macro RETURN_OBJECT_UNLESS_RETRY(ISOLATE, TYPE)
DCHECK(__object__ != isolate->heap()->exception());
return Handle<JSGlobalProxy>(JSGlobalProxy::cast(__object__), isolate);
}
/* Two GCs before panicking. In newspace will almost always succeed. */
for (int __i__ = 0; __i__ < 2; __i__++) {
isolate->heap()->CollectGarbage(__allocation__.RetrySpace(), GarbageCollectionReason::kAllocationFailure);
__allocation__ = FUNCTION_CALL;
if (__allocation__.To(&__object__)) { // expanded macro RETURN_OBJECT_UNLESS_RETRY(ISOLATE, TYPE)
DCHECK(__object__ != isolate->heap()->exception());
return Handle<JSGlobalProxy>(JSGlobalProxy::cast(__object__), isolate);
}
}
isolate->counters()->gc_last_resort_from_handles()->Increment();
isolate->heap()->CollectAllAvailableGarbage(GarbageCollectionReason::kLastResort);
{
AlwaysAllocateScope __scope__(isolate);
__allocation__ = FUNCTION_CALL;
}
if (__allocation__.To(&__object__)) { // expanded macro RETURN_OBJECT_UNLESS_RETRY(ISOLATE, TYPE)
DCHECK(__object__ != isolate->heap()->exception());
return Handle<JSGlobalProxy>(JSGlobalProxy::cast(__object__), isolate);
}
/* TODO(1181417): Fix this. */
v8::internal::Heap::FatalProcessOutOfMemory("CALL_AND_RETRY_LAST", true);
return Handle<TYPE>();
(lldb) expr JSGlobalProxy::cast($10.object_)
(v8::internal::JSGlobalProxy *) $12 = 0x00002ef850802239
JSGlobalProxy
can be found in `deps/v8/src/objects'
Back in deps/v8/src/bootstrapper.cc
deps/v8/src/bootstrapper.cc
if (!isolate->initialized_from_snapshot() ||
!Snapshot::NewContextFromSnapshot(isolate, global_proxy,
context_snapshot_index,
embedder_fields_deserializer)
.ToHandle(&native_context_)) {
native_context_ = Handle<Context>();
deps/v8/src/snapshot/snapshot-common.cc
:
Vector<const byte> context_data = ExtractContextData(blob, static_cast<uint32_t>(context_index));
SnapshotData snapshot_data(context_data);
MaybeHandle<Context> maybe_result = PartialDeserializer::DeserializeContext(
isolate, &snapshot_data, can_rehash, global_proxy,
embedder_fields_deserializer);
(lldb) p context_index
(size_t) $42 = 0
PartialDeserializer::DeserializeContext
:
Again I'm asking myself what is in the context:
(lldb) p result->Print()
(lldb) expr JSFunction::cast(result->get(0))->code()->Print()
0x38873dd44621: [Code]
kind = BUILTIN
name = EmptyFunction
compiler = unknown
address = 0x38873dd44621
Instructions (size = 15)
0x38873dd44680 0 48bb50fe690001000000 REX.W movq rbx,0x10069fe50 ;; external reference (Builtin_EmptyFunction)
0x38873dd4468a a e9b1abfcff jmp 0x38873dd0f240 (AdaptorWithBuiltinExitFrame) ;; code: BUILTIN
locInfo (size = 3)
0x38873dd44682 external reference (Builtin_EmptyFunction) (0x10069fe50)
0x38873dd4468b code target (BUILTIN) (0x38873dd0f240)
(lldb) expr result->get(1)
(v8::internal::Object *) $91 = 0x0000000000000000
The third entry in the array are the [global properties](https://tc39.github.io/ecma262/#sec-value-properties-of-the-global-object):
(lldb) expr JSGlobalObject::cast(result->get(2))
(v8::internal::JSGlobalObject *) $93 = 0x00002ef8f1387e21
(lldb) expr JSGlobalObject::cast(result->get(2))->Print()
0x2ef8f1387e21: [JSGlobalObject] in OldSpace
- map = 0x2ef8f2c029d1 [DictionaryProperties]
- prototype = 0x2ef8f1387e49
- elements = 0x2ef8bb802251 <FixedArray[0]> [HOLEY_ELEMENTS]
- global proxy = 0x2ef850802259 <JSGlobal Object>
- properties = 0x2ef8f1387ef9 <HashTable[133]> {
#Promise: 0x2ef8f1393a71 <JSFunction Promise (sfi = 0x2ef8bb828829)> (data, dict_index: 28, attrs: [W_C])
#eval: 0x2ef8f1396a59 <JSFunction eval (sfi = 0x2ef8bb836bb1)> (data, dict_index: 100, attrs: [W_C])
#ArrayBuffer: 0x2ef8f1389551 <JSFunction ArrayBuffer (sfi = 0x2ef8bb82f4f1)> (data, dict_index: 54, attrs: [W_C])
#Map: 0x2ef8f1395689 <JSFunction Map (sfi = 0x2ef8bb832691)> (data, dict_index: 76, attrs: [W_C])
#parseInt: 0x2ef8f1390371 <JSFunction parseInt (sfi = 0x2ef8bb823c71)> (data, dict_index: 12, attrs: [W_C])
#Uint8ClampedArray: 0x2ef8f138b2c1 <JSFunction Uint8ClampedArray (sfi = 0x2ef8bb80d671)> (data, dict_index: 72, attrs: [W_C])
#unescape: 0x2ef8f138c029 <JSFunction unescape (sfi = 0x2ef8bb836af9)> (data, dict_index: 98, attrs: [W_C])
#RegExp: 0x2ef8f13906e1 <JSFunction RegExp (sfi = 0x2ef8bb829351)> (data, dict_index: 30, attrs: [W_C])
#Uint16Array: 0x2ef8f138a759 <JSFunction Uint16Array (sfi = 0x2ef8bb80c3b1)> (data, dict_index: 60, attrs: [W_C])
#Error: 0x2ef8f138c0c9 <JSFunction Error (sfi = 0x2ef8bb82bcc9)> (data, dict_index: 32, attrs: [W_C])
#undefined: 0x2ef8bb8022e1 <undefined> (data, dict_index: 18, attrs: [___])
#Int32Array: 0x2ef8f138abe9 <JSFunction Int32Array (sfi = 0x2ef8bb80cd11)> (data, dict_index: 66, attrs: [W_C])
#Uint32Array: 0x2ef8f138a9a1 <JSFunction Uint32Array (sfi = 0x2ef8bb80c9f1)> (data, dict_index: 64, attrs: [W_C])
#Function: 0x2ef8f1384af9 <JSFunction Function (sfi = 0x2ef8bb821a29)> (data, dict_index: 4, attrs: [W_C])
#ReferenceError: 0x2ef8f1395419 <JSFunction ReferenceError (sfi = 0x2ef8bb82c191)> (data, dict_index: 38, attrs: [W_C])
#TypeError: 0x2ef8f138c569 <JSFunction TypeError (sfi = 0x2ef8bb82c3a9)> (data, dict_index: 42, attrs: [W_C])
#Float32Array: 0x2ef8f138ae31 <JSFunction Float32Array (sfi = 0x2ef8bb80d031)> (data, dict_index: 68, attrs: [W_C])
#encodeURI: 0x2ef8f1397cf1 <JSFunction encodeURI (sfi = 0x2ef8bb8368b9)> (data, dict_index: 92, attrs: [W_C])
#Intl: 0x2ef8f1397bc1 <Object map = 0x2ef8f2c04d21> (data, dict_index: 52, attrs: [W_C])
#JSON: 0x2ef8f1388359 <Object map = 0x2ef8f2c02ac1> (data, dict_index: 46, attrs: [W_C])
#Uint8Array: 0x2ef8f1389869 <JSFunction Uint8Array (sfi = 0x2ef8bb80bce9)> (data, dict_index: 56, attrs: [W_C])
#Date: 0x2ef8f138c7b1 <JSFunction Date (sfi = 0x2ef8bb826469)> (data, dict_index: 26, attrs: [W_C])
#Boolean: 0x2ef8f1397549 <JSFunction Boolean (sfi = 0x2ef8bb823d29)> (data, dict_index: 20, attrs: [W_C])
#WeakMap: 0x2ef8f1397829 <JSFunction WeakMap (sfi = 0x2ef8bb8334e1)> (data, dict_index: 80, attrs: [W_C])
#Math: 0x2ef8f1392b81 <Object map = 0x2ef8f2c04001> (data, dict_index: 48, attrs: [W_C])
#String: 0x2ef8f1393e89 <JSFunction String (sfi = 0x2ef8bb823f11)> (data, dict_index: 22, attrs: [W_C])
#escape: 0x2ef8f13977c9 <JSFunction escape (sfi = 0x2ef8bb836a41)> (data, dict_index: 96, attrs: [W_C])
#NaN: 0x2ef8bb802311 <Number nan> (data, dict_index: 16, attrs: [___])
#isFinite: 0x2ef8f1395ee1 <JSFunction isFinite (sfi = 0x2ef8bb836c69)> (data, dict_index: 102, attrs: [W_C])
#Infinity: 0x2ef8bb802a39 <Number inf> (data, dict_index: 14, attrs: [___])
#URIError: 0x2ef8f1395f41 <JSFunction URIError (sfi = 0x2ef8bb82c501)> (data, dict_index: 44, attrs: [W_C])
#Array: 0x2ef8f1384ab9 <JSFunction Array (sfi = 0x2ef8bb8223c1)> (data, dict_index: 6, attrs: [W_C])
#Int8Array: 0x2ef8f138a511 <JSFunction Int8Array (sfi = 0x2ef8bb80c091)> (data, dict_index: 58, attrs: [W_C])
#encodeURIComponent: 0x2ef8f1395e81 <JSFunction encodeURIComponent (sfi = 0x2ef8bb836979)> (data, dict_index: 94, attrs: [W_C])
#Float64Array: 0x2ef8f138b079 <JSFunction Float64Array (sfi = 0x2ef8bb80d351)> (data, dict_index: 70, attrs: [W_C])
#RangeError: 0x2ef8f1395c39 <JSFunction RangeError (sfi = 0x2ef8bb82c039)> (data, dict_index: 36, attrs: [W_C])
#console: 0x2ef8f1397d51 <console map = 0x2ef8f2c04d71> (data, dict_index: 50, attrs: [W_C])
#SyntaxError: 0x2ef8f138c089 <JSFunction SyntaxError (sfi = 0x2ef8bb82c251)> (data, dict_index: 40, attrs: [W_C])
#Symbol: 0x2ef8f1394f81 <JSFunction Symbol (sfi = 0x2ef8bb826049)> (data, dict_index: 24, attrs: [W_C])
#parseFloat: 0x2ef8f1390339 <JSFunction parseFloat (sfi = 0x2ef8bb823bb1)> (data, dict_index: 10, attrs: [W_C])
#isNaN: 0x2ef8f1397b39 <JSFunction isNaN (sfi = 0x2ef8bb836d01)> (data, dict_index: 104, attrs: [W_C])
#Number: 0x2ef8f1390049 <JSFunction Number (sfi = 0x2ef8bb8234a1)> (data, dict_index: 8, attrs: [W_C])
#WeakSet: 0x2ef8f1397229 <JSFunction WeakSet (sfi = 0x2ef8bb8337f9)> (data, dict_index: 82, attrs: [W_C])
#decodeURIComponent: 0x2ef8f1394f21 <JSFunction decodeURIComponent (sfi = 0x2ef8bb8367f1)> (data, dict_index: 90, attrs: [W_C])
#decodeURI: 0x2ef8f13974e9 <JSFunction decodeURI (sfi = 0x2ef8bb836731)> (data, dict_index: 88, attrs: [W_C])
#Object: 0x2ef8f1384319 <JSFunction Object (sfi = 0x2ef8bb81f989)> (data, dict_index: 2, attrs: [W_C])
#Reflect: 0x2ef8f13985e9 <Object map = 0x2ef8f2c04e61> (data, dict_index: 86, attrs: [W_C])
#DataView: 0x2ef8f1396229 <JSFunction DataView (sfi = 0x2ef8bb8317d9)> (data, dict_index: 74, attrs: [W_C])
#Set: 0x2ef8f1396b31 <JSFunction Set (sfi = 0x2ef8bb832e81)> (data, dict_index: 78, attrs: [W_C])
#EvalError: 0x2ef8f13937d9 <JSFunction EvalError (sfi = 0x2ef8bb82bf79)> (data, dict_index: 34, attrs: [W_C])
#Int16Array: 0x2ef8f13884b9 <JSFunction Int16Array (sfi = 0x2ef8bb80c6d1)> (data, dict_index: 62, attrs: [W_C])
#Proxy: 0x2ef8f1397031 <JSFunction Proxy (sfi = 0x2ef8bb833a79)> (data, dict_index: 84, attrs: [W_C])
}
(lldb) expr JSGlobalObject::cast(result->get(2))->global_dictionary()
(v8::internal::GlobalDictionary *) $94 = 0x00002ef8f1387ef9
(lldb) expr JSGlobalObject::cast(result->get(2))->global_dictionary()->ValueAt(0)
(v8::internal::Object *) $95 = 0x00002ef8f1393a71
(lldb) expr JSGlobalObject::cast(result->get(2))->global_dictionary()->ValueAt(0)->Print()
0x2ef8f1393a71: [Function] in OldSpace
- map = 0x2ef8f2c04141 [FastProperties]
- prototype = 0x2ef8f13842a9
- elements = 0x2ef8bb802251 <FixedArray[0]> [HOLEY_ELEMENTS]
- function prototype = 0x2ef8f1393d01 <Object map = 0x2ef8f2c041e1>
- initial_map = 0x2ef8f2c04191 <Map(HOLEY_ELEMENTS)>
- shared_info = 0x2ef8bb828829 <SharedFunctionInfo Promise>
- name = 0x2ef8bb8288b1 <String[7]: Promise>
- formal_parameter_count = 1
- kind = [ NormalFunction ]
- context = 0x2ef8f13839c1 <FixedArray[283]>
- code = 0x38873dd0e841 <Code BUILTIN>
- properties = 0x2ef8f1393cd9 <PropertyArray[3]> {
#length: 0x2ef8bb838c81 <AccessorInfo> (const accessor descriptor)
#name: 0x2ef8bb838c11 <AccessorInfo> (const accessor descriptor)
#prototype: 0x2ef8bb838cf1 <AccessorInfo> (const accessor descriptor)
0x2ef8bb838301 <Symbol: (native_context_index_symbol)>: 233 (data field 0) properties[0]
0x2ef8bb8386d9 <Symbol: Symbol.species>: 0x2ef8f1393ba9 <AccessorPair> (const accessor descriptor)
#all: 0x2ef8f1393bf9 <JSFunction all (sfi = 0x2ef8bb8289a9)> (const data descriptor)
#race: 0x2ef8f1393c31 <JSFunction race (sfi = 0x2ef8bb828a61)> (const data descriptor)
#resolve: 0x2ef8f1393c69 <JSFunction resolve (sfi = 0x2ef8bb828b19)> (const data descriptor)
#reject: 0x2ef8f1393ca1 <JSFunction reject (sfi = 0x2ef8bb828bd1)> (const data descriptor)
}
(lldb) expr JSGlobalObject::cast(result->get(2))->global_dictionary()
(v8::internal::GlobalDictionary *) $94 = 0x00002ef8f1387ef9
(lldb) expr result->get(3)->Print()
0x2ef8f13839c1: [FixedArray] in OldSpace
- map = 0x2ef829d82c51 <Map(HOLEY_ELEMENTS)>
- length: 283
0: 0x2ef8f13842a9 <JSFunction (sfi = 0x2ef8bb805519)>
I think this is the native context
(lldb) expr result->get(4)->Print()
0x2ef850802259: [JSGlobalProxy]
- map = 0x2ef8f2c02201 [FastProperties]
- prototype = 0x2ef8bb802201
- elements = 0x2ef8bb802251 <FixedArray[0]> [HOLEY_ELEMENTS]
- properties = 0x2ef8bb802251 <FixedArray[0]> {}
This looks like an empty array.
(lldb) job FixedArray::cast(result->get(5))
0x2ef8f1398a51: [FixedArray] in OldSpace
- map = 0x2ef829d82341 <Map(HOLEY_ELEMENTS)>
- length: 3
0-2: 0x2ef8bb8022e1 <undefined>
This is the embedder data which is a FixedArray.
(lldb) job result->get(6)
0x2ef8f2c04eb1: [Map]
- type: JS_OBJECT_TYPE
- instance size: 56
- inobject properties: 4
- elements kind: HOLEY_ELEMENTS
- unused property fields: 0
- enum length: invalid
- stable_map
- back pointer: 0x2ef8bb8022e1 <undefined>
- instance descriptors (own) #4: 0x2ef8f1398a79 <DescriptorArray[14]>
- layout descriptor: 0x0
- prototype: 0x2ef8f13842e1 <Object map = 0x2ef8f2c022a1>
- constructor: 0x2ef8f1384319 <JSFunction Object (sfi = 0x2ef8bb81f989)>
- dependent code: 0x2ef8bb802251 <FixedArray[0]>
- construction counter: 0
What we are looking at above the the output for result->get(6)->map()->Print(). You can see the source for this
in `deps/v8/src/objects-printer.cc` `Map::MapPrint`.
### GetProperty
```console
expr isolate->factory()->InternalizeUtf8String("something");
(v8::internal::Handle<v8::internal::String>) $112 = {
v8::internal::HandleBase = {
location_ = 0x0000000106002be8
}
}
(lldb) expr v8::internal::Object::GetProperty(v8::internal::Handle<v8::internal::Object>(result->get(6))
ScopeInfo
(lldb) expr PrintBuiltinSizes(this)
TODO: What is ContextInfo in src/env.h?
GYP
If you just want to see what a change in node.gyp does you can make the change and the run:
$ make out/Makefile
Then you can inspect the generated make files in the out directory.
test_node_postmortem issue
(This turned out to have nothing to to with the postmortem code but that is how I ran into this) In the constructor of HandleWrap (src/handle_wrap.cc) we have the following:
handle_->data = this;
HandleScope scope(env->isolate());
Wrap(object, this);
env->handle_wrap_queue()->PushBack(this);
The error I'm looking at is that when env->handle_wrap_queue()->PushBack(this);
is called there is an illegal access as
in PushBack
(src/util-inl.h):
template <typename T, ListNode<T> (T::*M)>
void ListHead<T, M>::PushBack(T* element) {
ListNode<T>* that = &(element->*M);
head_.prev_->next_ = that;
that->prev_ = head_.prev_;
that->next_ = &head_;
head_.prev_ = that;
}
If we inspect head_
we find:
(lldb) expr head_
(node::ListNode<node::HandleWrap>) $89 = (prev_ = 0x0000000000000000, next_ = 0x0000000104900570)
And head_.prev_->next
will dereference a null pointer leading to a EXC_BAD_ACCESS
.
Why is this happening?
When is the handle_wrap_queue created?
An Environment has the following typedef and field:
typedef ListHead<HandleWrap, &HandleWrap::handle_wrap_queue_> HandleWrapQueue;
...
HandleWrapQueue handle_wrap_queue_;
So every Environment has a handle_wrap_queue
list which stores HandleWrap instances and the member in HandleWrap that holds the
node datastructure (of type ListNode) is &HandleWrap::handle_wrap_queue_
:
(lldb) expr &HandleWrap::handle_wrap_queue_
(node::ListNode<node::HandleWrap> node::HandleWrap::*) $0 = 30 00 00 00 00 00 00 00
So when the Environment constructor is called, a ListHead<HandleWrap, &HandleWrap::handle_wrap_queue> instance will be create, by calling its constructor. ListHead can be found in src/util.h and the impl in src/util-inl.h:
inline ListHead() = default;
So ListHead has a default (no-args) constructor generated by the compiler, but ListHead has a member that get initialized too:
ListNode<T> head_;
So lets take a look at the contructor for ListNode
which we can find in src/util-inl.h:
template <typename T>
ListNode<T>::ListNode() : prev_(this), next_(this) {}
We can inspect the ListNode instance before the constructor returns:
(lldb) expr *this
(node::ListNode<node::HandleWrap>) $93 = (prev_ = 0x00000001070034c8, next_ = 0x00000001070034c8)
We can see that both prev_
and next_
have been set to this newly created ListNode (both prev_ and next_ point to itself).
(node::ListNode<node::HandleWrap> *) $22 = 0x00000001070058c8
(lldb) expr *this
(node::ListNode<node::HandleWrap>) $23 = (prev_ = 0x00000001070058c8, next_ = 0x00000001070058c8)
(lldb) expr &env->handle_wrap_queue_
(HandleWrapQueue *) $3 = 0x00000001060060c8
(lldb) expr env->handle_wrap_queue()
(HandleWrapQueue *) $4 = 0x00000001060060b8
The above is from CreateEnvironment. The only difference here is that in the test case I have added:
env->handle_wrap_queue();
Notice that env->handle_wrap_queue() returns 0x00000001050038b8
and env->handle_wrap_queue_ is 0x00000001050038c8
.
And if I remove the statement env->handle_wrap_queue()
from the test the addresses look good again:
(lldb) expr env->handle_wrap_queue_
(node::Environment::HandleWrapQueue) $0 = {
head_ = {
prev_ = 0x00000001060058c8
next_ = 0x00000001060058c8
}
}
(lldb) expr &env->handle_wrap_queue_
(HandleWrapQueue *) $1 = 0x00000001060058c8
(lldb) expr env->handle_wrap_queue()
(HandleWrapQueue *) $2 = 0x00000001060058c8
(lldb) n
(lldb) expr env->handle_wrap_queue()
(HandleWrapQueue *) $3 = 0x00000001060058c8
- Without
env->handle_wrap_queue()
, env->handle_wrap_queue() will return the correct address, but env->handle_wrap_queue_ will still be incorrect. - With
env->handle_wrap_queue()
, env->handle_wrap_queue() will be incorrect, and so will env-handle_wrap_queue_, infact they will be the same. Lets set a break point in the main of cctest:
Lets disassemble `handle_wrap_queue` when we dont have the `env->handle_wrap_queue()` call:
```console
(lldb) dis -n handle_wrap_queue
cctest`node::Environment::handle_wrap_queue:
cctest[0x1018d4be0] <+0>: pushq %rbp
cctest[0x1018d4be1] <+1>: movq %rsp, %rbp
cctest[0x1018d4be4] <+4>: movq %rdi, -0x8(%rbp)
cctest[0x1018d4be8] <+8>: movq -0x8(%rbp), %rdi
cctest[0x1018d4bec] <+12>: addq $0x4c8, %rdi ; imm = 0x4C8
cctest[0x1018d4bf3] <+19>: movq %rdi, %rax
cctest[0x1018d4bf6] <+22>: popq %rbp
cctest[0x1018d4bf7] <+23>: retq
And then with the env->handle_wrap_queue()
call:
(lldb) dis -n handle_wrap_queue
cctest`node::Environment::handle_wrap_queue:
cctest[0x10189e1e0] <+0>: pushq %rbp
cctest[0x10189e1e1] <+1>: movq %rsp, %rbp
cctest[0x10189e1e4] <+4>: movq %rdi, -0x8(%rbp)
cctest[0x10189e1e8] <+8>: movq -0x8(%rbp), %rdi
cctest[0x10189e1ec] <+12>: addq $0x4b8, %rdi ; imm = 0x4B8
cctest[0x10189e1f3] <+19>: movq %rdi, %rax
cctest[0x10189e1f6] <+22>: popq %rbp
cctest[0x10189e1f7] <+23>: retq
We can find a generated call of handle_wrap_queue() in obj.target/node_lib/node.o:
$ otool -tvV out/Debug/obj.target/node_lib/src/node.o:
__ZN4node11Environment17handle_wrap_queueEv:
0000000000005950 pushq %rbp
0000000000005951 movq %rsp, %rbp
0000000000005954 movq %rdi, -0x8(%rbp)
0000000000005958 movq -0x8(%rbp), %rdi
000000000000595c addq $0x4c8, %rdi
0000000000005963 movq %rdi, %rax
0000000000005966 popq %rbp
0000000000005967 retq
$ otool -tvV out/Debug/obj.target/cctest/test/cctest/test_node_postmortem_metadata.o:
__ZN4node11Environment17handle_wrap_queueEv:
0000000000000920 pushq %rbp
0000000000000921 movq %rsp, %rbp
0000000000000924 movq %rdi, -0x8(%rbp)
0000000000000928 movq -0x8(%rbp), %rdi
000000000000092c addq $0x4b8, %rdi
0000000000000933 movq %rdi, %rax
0000000000000936 popq %rbp
0000000000000937 retq
The reason for this difference is because there was a missing macro defined which when compiling the cctest target. This is what caused the incorrect offset to be added and returned.
node_postmortem_metadata
#define NODEDBG_SYMBOL(Name) nodedbg_ ## Name
// nodedbg_offset_CLASS__MEMBER__TYPE: Describes the offset to a class member.
#define NODEDBG_OFFSET(Class, Member, Type) \
NODEDBG_SYMBOL(offset_ ## Class ## __ ## Member ## __ ## Type)
// These are the constants describing Node internal structures. Every constant
// should use the format described above. These constants are declared as
// global integers so that they'll be present in the generated node binary. They
// also need to be declared outside any namespace to avoid C++ name-mangling.
#define NODE_OFFSET_POSTMORTEM_METADATA(V) \
V(BaseObject, persistent_handle_, v8_Persistent_v8_Object, \
BaseObject::persistent_handle_) \
V(Environment, handle_wrap_queue_, Environment_HandleWrapQueue, \
Environment::handle_wrap_queue_) \
V(Environment, req_wrap_queue_, Environment_ReqWrapQueue, \
Environment::req_wrap_queue_) \
V(HandleWrap, handle_wrap_queue_, ListNode_HandleWrap, \
HandleWrap::handle_wrap_queue_) \
V(Environment_HandleWrapQueue, head_, ListNode_HandleWrap, \
Environment::HandleWrapQueue::head_) \
V(ListNode_HandleWrap, next_, uintptr_t, ListNode<HandleWrap>::next_) \
V(ReqWrap, req_wrap_queue_, ListNode_ReqWrapQueue, \
ReqWrap<uv_req_t>::req_wrap_queue_) \
V(Environment_ReqWrapQueue, head_, ListNode_ReqWrapQueue, \
Environment::ReqWrapQueue::head_) \
V(ListNode_ReqWrap, next_, uintptr_t, ListNode<ReqWrap<uv_req_t>>::next_)
extern "C" {
int nodedbg_const_Environment__kContextEmbedderDataIndex__int;
uintptr_t nodedbg_offset_ExternalString__data__uintptr_t;
#define V(Class, Member, Type, Accessor) \
NODE_EXTERN uintptr_t NODEDBG_OFFSET(Class, Member, Type);
NODE_OFFSET_POSTMORTEM_METADATA(V)
#undef V
}
So lets expand one, for example Environment_HandleWrapQueue
:
V(Environment, handle_wrap_queue_, Environment_HandleWrapQueue, \
Environment::handle_wrap_queue_) \
NODE_EXTERN uintptr_t nodedbg_offset_Environment__handle_wrap_queue___Environment_HandleWrapQueue;
(lldb) expr nodedbg_offset_Environment__handle_wrap_queue___Environment_HandleWrapQueue
(uintptr_t) $7 = 1224
nodedbg_offset_Environment__handle_wrap_queue___Environment_HandleWrapQueue = OffsetOf(Environment::handle_wrap_queue_);
When int GenDebugSymbols()
is called will call:
(lldb) br s -f node_postmortem_metadata.cc -l 103
(lldb) expr nodedbg_offset_Environment__handle_wrap_queue___Environment_HandleWrapQueue
(uintptr_t) $19 = 1224
What does OffsetOf(Environment::handle_wrap_queue_)
actually do?
template <typename Inner, typename Outer>
constexpr uintptr_t OffsetOf(Inner Outer::*field) {
return reinterpret_cast<uintptr_t>(&(static_cast<Outer*>(0)->*field));
}
Note that the parameter field
is a pointer-to-member, which gives the offset of the member within the class object as opposed to using the
address-of operator on a data member bound to an actual class object, which yields the member's actual address in memory.
For example:
(lldb) expr
Enter expressions, then terminate with an empty line to evaluate:
1: class Test {
2: int doit() { return 10; };
3: };
4: Test* t = 0;
5: t->doit();
6:
(int) $1 = 10
Lets take the following call:
nodedbg_offset_Environment__handle_wrap_queue___Environment_HandleWrapQueue = OffsetOf(Environment::handle_wrap_queue_);
I came accross the following pre-processor directive:
#define private friend int GenDebugSymbols(); private
So we are defining private
as an identifier that will resolve friend int GenDebugSymbols(); private
. This comes before a number
of includes in the file. So every private section in classes on those files will be replaced. For example:
private:
will become:
friend int GenDebugSymbols(); private:
Finding called functions
I needed to figure out if a function was being called by Node. The function in question was inet_addr
.
$ cscope -R
Functions calling this function: inet_addr
File Function Line
0 ares__get_hostent.c ares__get_hostent 141 addr.addrV4.s_addr = inet_addr(txtaddr);
1 ares_gethostbyname.c fake_hostent 268 result = ((in.s_addr = inet_addr(name)) ==
INADDR_NONE ? 0 : 1);
2 ares_init.c ip_addr 2325 addr->s_addr = inet_addr(ipbuf);
3 vms_term_sock.c CreateSocketPair 323 sin.sin_addr.s_addr = inet_addr (LocalHostAddr);
4 vms_term_sock.c CreateSocketPair 439 sin.sin_addr.s_addr = inet_addr (LocalHostAddr) ;
5 test.c main 90 addr.sin_addr.s_addr = inet_addr(argv[1]);
6 cli.cpp main 54 sa.sin_addr.s_addr = inet_addr ("127.0.0.1");
Looking at ares_get_hostent
:
Functions calling this function: ares__get_hostent
File Function Line
0 ares_gethostbyaddr.c file_lookup 236 while ((status = ares__get_hostent(fp, addr->family, host))
== ARES_SUCCESS)
1 ares_gethostbyname.c file_lookup 397 while ((status = ares__get_hostent(fp, family, host)) ==
ARES_SUCCESS)
Functions calling this function: file_lookup
File Function Line
0 ares_gethostbyaddr.c next_lookup 120 status = file_lookup(&aquery->addr, &host);
1 ares_gethostbyname.c next_lookup 155 status = file_lookup(hquery->name,
hquery->want_family, &host);
2 ares_gethostbyname.c ares_gethostbyname_file 326 result = file_lookup(name, family, host);
Functions calling this function: ares_gethostbyname_file
File Function Line
0 ares.h defined 407 CARES_EXTERN int ares_gethostbyname_file(ares_channel channel,
This function is not called by node.
Functions calling this function: fake_hostent
File Function Line
0 ares_gethostbyname.c ares_gethostbyname 98 if (fake_hostent(name, family, callback, arg))
Functions calling this function: ares_gethostbyname
File Function Line
0 ares.h defined 401 CARES_EXTERN void ares_gethostbyname(ares_channel channel,
This function not called by node.
Functions calling this function: ip_addr
File Function Line
0 ares_init.c config_sortlist 2116 else if (ip_addr(ipbuf, q-str, &pat.addrV4) == 0)
1 ares_init.c config_sortlist 2122 if (ip_addr(ipbuf, q-str, &pat.mask.addr4) != 0)
Functions calling this function: config_sortlist
File Function Line
0 ares_init.c init_by_resolv_conf 1644 status = config_sortlist(&sortlist, &nsort, p);
1 ares_init.c ares_set_sortlist 2488 status = config_sortlist(&sortlist, &nsort, sortstr);
Functions calling this function: init_by_resolv_conf
File Function Line
0 ares_init.c ares_init_options 206 status = init_by_resolv_conf(channel);
Module Initialization
An addon can have an initializer function that will be called when node::DLOpen
is called. DLOpen is
available on the process object as dlopen
. There is the following code in DLOpen (in node.cc):
if (auto callback = GetInitializerCallback(&dlib)) {
callback(exports, module, context);
} else {
dlib.Close();
}
And if we look at GetInitializerCallback
:
using InitializerCallback = void (*)(Local<Object> exports,
Local<Value> module,
Local<Context> context);
inline InitializerCallback GetInitializerCallback(DLib* dlib) {
const char* name = "node_register_module_v" STRINGIFY(NODE_MODULE_VERSION);
return reinterpret_cast<InitializerCallback>(dlib->GetSymbolAddress(name));
}
we can see that are going to look for a name node_register_module_v
and the value of NODE_MODULE_VERSION
.
(lldb) expr name
(const char *) $21 = 0x00000001026012cc "node_register_module_v61"
The above comes from debugging test/addons/hello-world and the NODE_MODULE_VERSION is 61, which is set in node_version.h. So if the addon declares a function named node_register_module_v61 it will have its initialize function called.
Streams
StreamBase
extends StreamResource which represents a generic stream and has public member functions
like ReadStart, ReadStop, DoShutdown, DoTryWrite, DoWrite, PushStreamListener, RemoveStreamListener
It as a two protected members:
StreamListener* listener_ = nullptr;
uint64_t bytes_read_ = 0;
Lets take a closer look as StreamListener. To understand things lets see how it is used by debugging:
$ lldb -- out/Debug/node --inspect-brk test/parallel/test-fs-read-stream.js
(lldb) br s -f stream_wrap.cc -l 57
If we back up the frame stack a little we can see that this is called from GetBinding with a string value of:
(lldb) up 2
(lldb) jlh module
#stream_wrap
This call originates from lib/internal/boostrap.node.js where we have:
setupGlobalConsole();
...
function setupGlobalConsole() {
const wrappedConsole = NativeModule.require('console');
}
This will trickle down into
module.exports = new Console(process.stdout, process.stderr);
function getStdout() {
if (stdout) return stdout;
stdout = createWritableStdioStream(1);
stdout.destroySoon = stdout.destroy;
stdout._destroy = function(er, cb) {
// Avoid errors if we already emitted
er = er || new ERR_STDOUT_CLOSE();
cb(er);
};
if (stdout.isTTY) {
process.on('SIGWINCH', () => stdout._refreshSize());
}
return stdout;
}
function createWritableStdioStream(fd) {
...
var tty = require('tty');
}
And in lib/tty.js
we have:
const net = require('net');
And in `lib/net.js we have:
process.binding('stream_wrap');
So that is the first time that LibuvStreamWrap::Initialize
is called. How about a function like OnStreamRead
?
(lldb) br s -n OnStreamRead
EmitToJSStreamListener::OnStreamRead
int LibuvStreamWrap::ReadStart() {
return uv_read_start(stream(), [](uv_handle_t* handle,
size_t suggested_size,
uv_buf_t* buf) {
static_cast<LibuvStreamWrap*>(handle->data)->OnUvAlloc(suggested_size, buf);
}, [](uv_stream_t* stream, ssize_t nread, const uv_buf_t* buf) {
static_cast<LibuvStreamWrap*>(stream->data)->OnUvRead(nread, buf);
});
}
Notice that this is calling uv_read_start
and the second argument (the lambda) the allocation callback for libuv, and
the third argument is the read callback. So when ReadStart
called? Well it is set in the Initialize function for
js_stream.cc, node_file.cc, node_http2.cc, and tls_wrap.cc:
With those out of the was, we should stop in the debug console in our test file:
const file = fs.ReadStream(fn);
And in ReadStream we can find:
if (typeof this.fd !== 'number')
this.open();
This will call ReadStream.open
which looks like this:
fs.open(this.path, this.flags, this.mode, function(er, fd) {
if (er) {
if (self.autoClose) {
self.destroy();
}
self.emit('error', er);
return;
}
self.fd = fd;
self.emit('open', fd);
self.emit('ready');
// start the flow of data.
self.read();
});
};
fs.open
can be found in lib/fs.js
and what I think is the important parts are:
const req = new FSReqWrap();
req.oncomplete = callback;
binding.open(pathModule.toNamespacedPath(path),
stringToFlags(flags),
mode,
req);
Notice that new
FSReqWrap` will call the C++ constructor in node_file.h. And also note that the callback is
set on the req.
AsyncDestCall
CHECK_NE(req_wrap, nullptr);
req_wrap->Init(syscall, dest, len, enc);
int err = fn(env->event_loop(), req_wrap->req(), fn_args..., after);
req_wrap->Dispatched();
In our case fn
will be uv_fs_open
.
const req = new FSReqWrap();
req.oncomplete = callback;
binding.open(pathModule.toNamespacedPath(path),
stringToFlags(flags),
mode,
req);
Lets start by takeing a look at classes that extend StreamBase: class JSStream : public AsyncWrap, public StreamBase class FileHandle : public AsyncWrap, public StreamBase class Http2Stream : public AsyncWrap, public StreamBase class StreamPipe : public AsyncWrap class LibuvStreamWrap : public HandleWrap, public StreamBase class TLSWrap : public AsyncWrap, public crypto::SSLWrap<TLSWrap>, public StreamBase, public StreamListener
I noticed that stream_base.h includes req_wrap-inl.h
but as far as I can tell nothing from ReqWrap
is used in StreamBase.
Let's take a look when the StreamReq constructor is called.
$ lldb out/Debug/node test/parallel/test-stream-wrap.js
(lldb) br s -f stream_wrap.cc -l 58
In test-stream-wrap.js we have:
const ShutdownWrap = process.binding('stream_wrap').ShutdownWrap;
This will break in LibuvStreamWrap::Initialize
. This function will create two constructor functions,
one named ShutdownWrap
and WriteWrap
. Both share the same actual constructor function which is
defined as a lambda:
auto is_construct_call_callback = [](const FunctionCallbackInfo<Value>& args) {
CHECK(args.IsConstructCall());
StreamReq::ResetObject(args.This());
};
Local<FunctionTemplate> sw = FunctionTemplate::New(env->isolate(), is_construct_call_callback);
sw->InstanceTemplate()->SetInternalFieldCount(StreamReq::kStreamReqField + 1);
Local<String> wrapString = FIXED_ONE_BYTE_STRING(env->isolate(), "ShutdownWrap");
sw->SetClassName(wrapString);
AsyncWrap::AddWrapMethods(env, sw);
target->Set(wrapString, sw->GetFunction());
env->set_shutdown_wrap_template(sw->InstanceTemplate());
> process.binding('stream_wrap')
{ ShutdownWrap: [Function: ShutdownWrap],
WriteWrap: [Function: WriteWrap] }
AsyncWrap::AddWrapMethods
adds the getAsyncId
and asyncReset
functions to ShutdownWrap function template.
The constructor for LibuvStreamWrap
delegates to StreamBase(env)
which does:
PushStreamListener(&default_listener_);
default_listener_
can be found in stream_base.h
and is of type EmitToJSStreamListener
:
class EmitToJSStreamListener : public ReportWritesToJSStreamListener {
public:
void OnStreamRead(ssize_t nread, const uv_buf_t& buf) override;
};
src/stream_base.h
includes req_wrap-inl.h
but does not use the ReqWrap
which is the
only class req_wrap
contains. Commenting out this include will cause a number
of compile time warnings and link time errors, for example:
sers/danielbevenius/work/nodejs/node/out/Debug/obj.target/node_lib/src/tls_wrap.o ../src/tls_wrap.cc
In file included from ../src/tcp_wrap.cc:22:
In file included from ../src/tcp_wrap.h:27:
In file included from ../src/async_wrap.h:27:
../src/base_object.h:41:32: warning: inline function 'node::BaseObject::object' is not defined [-Wundefined-inline]
inline v8::Local<v8::Object> object();
^
../src/stream_base-inl.h:45:26: note: used here
return GetAsyncWrap()->object();
^
So if we follow this, line 22 of tcp_wrap.cc
is the include of tcp_wrap.h
, and line 27 of tcp_wrap.h
is
the include of async_wrap.h
. Line 27 of async_wrap.h
is the include of base_object.h
and line 41:
inline v8::Local<v8::Object> object();
And if we look at stream_base-inl.h
line 45:
inline v8::Local<v8::Object> StreamReq::object() {
return GetAsyncWrap()->object();
}
req_wrap-inl.h
includes async_wrap-inl.h
, which in turn includes base_object-inl.h
which has the following
definition of object():
inline v8::Local<v8::Object> BaseObject::object() {
return PersistentToLocal(env_->isolate(), persistent_handle_);
}
If stream_base.h
includes async_wrap-inl.h
it will get a definition of object()
since async_wrap-inl.h
includes base_object-inl.h'. But removing
req_wrap-inl.hwill also have an affect on connect_wrap.h which will no longer have a definition of
Dispatch, so it should inlude
req_wrap-inl.h instead of just 'req_wrap.h
TTYWrap
lib/tty.js
:
const { TTY, isTTY } = process.binding('tty_wrap');
Take a closer look at the duplication in lib/internal/process/stdio.js.
test-child-process-spawnsync-validation-errors.js
Just a note for myself that you need to run as non-root when running the tests or you'll get the following error:
bash-4.2# out/Release/node test/parallel/test-child-process-spawnsync-validation-errors.js
Mismatched innerFn function calls. Expected exactly 62, actual 42.
at Object.exports.mustCall (/home/danbev/node/test/common/index.js:427:10)
at Object.expectsError (/home/danbev/node/test/common/index.js:736:18)
at Object.<anonymous> (/home/danbev/node/test/parallel/test-child-process-spawnsync-validation-errors.js:14:12)
at Module._compile (internal/modules/cjs/loader.js:677:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:688:10)
at Module.load (internal/modules/cjs/loader.js:588:32)
at tryModuleLoad (internal/modules/cjs/loader.js:527:12)
at Function.Module._load (internal/modules/cjs/loader.js:519:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:718:10)
I sometime need to setup run in docker containers and this has hit me then. Just create a user and switch to that user and things should work.
Bundled ca with openssl-system-ca-path
Currently it is possible to specify that system ca certs should be included using --openssl-system-ca-path
.
There is a test named test-tls-cnnic-whitelist.js that uses the
--use-bundled-ca` option which fails (since
8.11.0):
$ out/Debug/node --use-bundled-ca test/parallel/test-tls-cnnic-whitelist.js
assert.js:80
throw new AssertionError(obj);
^
AssertionError [ERR_ASSERTION]: 'CERT_SIGNATURE_FAILURE' strictEqual 'UNABLE_TO_VERIFY_LEAF_SIGNATURE'
at TLSSocket.client.on.common.mustCall (/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-cnnic-whitelist.js:64:14)
at TLSSocket.<anonymous> (/Users/danielbevenius/work/nodejs/node/test/common/index.js:467:15)
at TLSSocket.emit (events.js:182:13)
at emitErrorNT (internal/streams/destroy.js:75:8)
at process._tickCallback (internal/process/next_tick.js:174:19)
So for some reason the clients certificate is not acceptable to the server, in this case the error is
CERT_SIGNATURE_FAILURE
, but in the master branch the expected error is UNABLE_TO_VERIFY_LEAF_SIGNATURE
.
The following clientOptions
are used in the above failed case:
{
// Test 0: for the check of a cert not in the whitelist.
// agent7-cert.pem is issued by the fake CNNIC root CA so that its
// hash is not listed in the whitelist.
// fake-cnnic-root-cert has the same subject name as the original
// rootCA.
serverOpts: {
key: loadPEM('agent7-key'),
cert: loadPEM('agent7-cert')
},
clientOpts: {
port: undefined,
rejectUnauthorized: true,
ca: [loadPEM('fake-cnnic-root-cert')]
},
errorCode: 'UNABLE_TO_VERIFY_LEAF_SIGNATURE'
}
There was an issue with this test which caused a different error than the one expected, see https://github.com/nodejs/node/pull/19767 for details.
I think that this test was not worked as expected sinse 2bc7841d0fcdd066fe477873229125b6f003b693
("test: use random ports where possible"). The test in that commit checked for CERT_REVOKED
.
I tried checking out that version and running it, and with the update to the cert and the
change in the above mentioned PR it still worked. I noticed that this was done by src/node_crypto.cc
and the function CheckWhitelistedServerCert
which has since been removed. Lets find out when it
was removed:
$ git blame --reverse 2bc7841d0fcdd066fe477873229125b6f003b693.. src/node_crypto.cc
test: remove test case 0 from test-tls-cnnic-whitelist.js
I looks like this test has not worked as expected since commit
2bc7841d0fcdd066fe477873229125b6f003b693 ("test: use random ports
where possible"). The test in that commit checked for CERT_REVOKED
which was returned by CheckWhitelistedServerCert.
CheckWhitelistedServerCert was removed in commit
dc875438a3953102febffa79b691317bb24ba2aa ("src: drop CNNIC+StartCom
certificate whitelisting").
I'm suggesting that this test case be removed as I don't think it is valid anymore.
It turns out that the fake-cnnic-root-cert
has expired:
$ openssl x509 -in fake-cnnic-root-cert.pem -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
c2:79:4a:2b:ea:49:93:6d
Signature Algorithm: sha256WithRSAEncryption
Issuer: C=CN, O=CNNIC, CN=CNNIC ROOT
Validity
Not Before: Jun 9 17:15:16 2015 GMT
Not After : Mar 29 17:15:16 2018 GMT
Lets create a new one and see if the test works now:
$ rm fake-cnnic-root-cert.pem
$ make fake-cnnic-root-cert.pem
openssl req -x509 -new \
-key fake-cnnic-root-key.pem \
-days 1024 \
-out fake-cnnic-root-cert.pem \
-config fake-cnnic-root.cnf
$ openssl x509 -in fake-cnnic-root-cert.pem -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
a9:dd:e8:f6:46:aa:9b:73
Signature Algorithm: sha1WithRSAEncryption
Issuer: C=CN, O=CNNIC, CN=CNNIC ROOT
Validity
Not Before: Apr 13 06:23:48 2018 GMT
Not After : Jan 31 06:23:48 2021 GMT
With those updates a re-run of the test will just "hang", actuall it is now waiting for incoming connections:
$ out/Debug/node --use-bundled-ca test/parallel/test-tls-cnnic-whitelist.js
So what is this test really expecting? Notice that --use-bundled-ca is specified and these have recently been updated. In commit 79fa372b79 ("crypto: update root certificates") the CNNIC Root certificate was removed:
Certificates removed:
- CNNIC ROOT
So, lets debug this:
const client = tls.connect(tcase.clientOpts);
This call will land in lib/_tls_wrap.js
which will setup the options and the call
const context = options.secureContext || tls.createSecureContext(options);
createSecureContext
can be found in _tls_common.js
:
const { SecureContext: NativeSecureContext } = internalBinding('crypto');
...
var c = new SecureContext(options.secureProtocol, secureOptions, context);
So the error means that the server cannot verify the signature of the of the ca
"no signatures could be verified because the chain contains only one certificate and it is not self signed."
TLS sreateServer
This function can be found in lib/_tlw_wrap.js
:
exports.createServer = function(options, listener) {
return new Server(options, listener);
};
$ lldb -- out/Debug/node --inspect-brk ../scripts/tls-server.js
const server = tls.Server(options, function(socket) {
tls.Server
is a function defined as :
exports.Server = require('_tls_wrap').Server;
This will land in lib/_tls_wrap and the Server function which will parst the options passed in and
store them.
this.setOptions(options);
var sharedCreds = tls.createSecureContext({
pfx: this.pfx,
key: this.key,
passphrase: this.passphrase,
cert: this.cert,
clientCertEngine: this.clientCertEngine,
ca: this.ca,
ciphers: this.ciphers,
ecdhCurve: this.ecdhCurve,
dhparam: this.dhparam,
secureProtocol: this.secureProtocol,
secureOptions: this.secureOptions,
honorCipherOrder: this.honorCipherOrder,
crl: this.crl,
sessionIdContext: this.sessionIdContext
});
This will call lib/_tls_common.js
createSecureContext
var c = new SecureContext(options.secureProtocol, secureOptions, context);
Will this.context = new NativeSecureContext();
This will call src/node_crypto.h SecureContext::New
context.init will call SecureContext::Init:
const SSL_METHOD* method = TLS_method();
TLS_method
is defined in deps/openssl/openssl/ssl/:
IMPLEMENT_tls_meth_func(TLS_ANY_VERSION, 0, 0,
TLS_method,
ossl_statem_accept,
ossl_statem_connect, TLSv1_2_enc_data)
IMPLEMENT_tls_meth_func
is a macro in deps/openssl/openssl/ssl/ssl_locl.h
. This function will return
a referense to a const struct:
struct ssl_method_st {
int version;
unsigned flags;
unsigned long mask;
int (*ssl_new) (SSL *s);
void (*ssl_clear) (SSL *s);
void (*ssl_free) (SSL *s);
int (*ssl_accept) (SSL *s);
int (*ssl_connect) (SSL *s);
int (*ssl_read) (SSL *s, void *buf, int len);
int (*ssl_peek) (SSL *s, void *buf, int len);
int (*ssl_write) (SSL *s, const void *buf, int len);
int (*ssl_shutdown) (SSL *s);
int (*ssl_renegotiate) (SSL *s);
int (*ssl_renegotiate_check) (SSL *s);
int (*ssl_read_bytes) (SSL *s, int type, int *recvd_type,
unsigned char *buf, int len, int peek);
int (*ssl_write_bytes) (SSL *s, int type, const void *buf_, int len);
int (*ssl_dispatch_alert) (SSL *s);
long (*ssl_ctrl) (SSL *s, int cmd, long larg, void *parg);
long (*ssl_ctx_ctrl) (SSL_CTX *ctx, int cmd, long larg, void *parg);
const SSL_CIPHER *(*get_cipher_by_char) (const unsigned char *ptr);
int (*put_cipher_by_char) (const SSL_CIPHER *cipher, unsigned char *ptr);
int (*ssl_pending) (const SSL *s);
int (*num_ciphers) (void);
const SSL_CIPHER *(*get_cipher) (unsigned ncipher);
long (*get_timeout) (void);
const struct ssl3_enc_method *ssl3_enc; /* Extra SSLv3/TLS stuff */
int (*ssl_version) (void);
long (*ssl_callback_ctrl) (SSL *s, int cb_id, void (*fp) (void));
long (*ssl_ctx_callback_ctrl) (SSL_CTX *s, int cb_id, void (*fp) (void));
};
(lldb) expr *method
(SSL_METHOD) $54 = {
version = 65536
flags = 0
mask = 0
ssl_new = 0x00000001019bd2f0 (node`tls1_new at t1_lib.c:98)
ssl_clear = 0x00000001019bd380 (node`tls1_clear at t1_lib.c:112)
ssl_free = 0x00000001019bd340 (node`tls1_free at t1_lib.c:106)
ssl_accept = 0x00000001019a5660 (node`ossl_statem_accept at statem.c:174)
ssl_connect = 0x00000001019a4fa0 (node`ossl_statem_connect at statem.c:169)
ssl_read = 0x0000000101988740 (node`ssl3_read at s3_lib.c:3875)
ssl_peek = 0x00000001019888a0 (node`ssl3_peek at s3_lib.c:3880)
ssl_write = 0x00000001019885f0 (node`ssl3_write at s3_lib.c:3836)
ssl_shutdown = 0x0000000101988450 (node`ssl3_shutdown at s3_lib.c:3786)
ssl_renegotiate = 0x00000001019888d0 (node`ssl3_renegotiate at s3_lib.c:3885)
ssl_renegotiate_check = 0x0000000101988660 (node`ssl3_renegotiate_check at s3_lib.c:3897)
ssl_read_bytes = 0x000000010197c250 (node`ssl3_read_bytes at rec_layer_s3.c:976)
ssl_write_bytes = 0x000000010197a2e0 (node`ssl3_write_bytes at rec_layer_s3.c:344)
ssl_dispatch_alert = 0x00000001019894b0 (node`ssl3_dispatch_alert at s3_msg.c:68)
ssl_ctrl = 0x0000000101985ba0 (node`ssl3_ctrl at s3_lib.c:2898)
ssl_ctx_ctrl = 0x0000000101986ce0 (node`ssl3_ctx_ctrl at s3_lib.c:3261)
get_cipher_by_char = 0x0000000101987d20 (node`ssl3_get_cipher_by_char at s3_lib.c:3564)
put_cipher_by_char = 0x0000000101987d80 (node`ssl3_put_cipher_by_char at s3_lib.c:3576)
ssl_pending = 0x0000000101979b70 (node`ssl3_pending at rec_layer_s3.c:130)
num_ciphers = 0x0000000101985720 (node`ssl3_num_ciphers at s3_lib.c:2781)
get_cipher = 0x0000000101985730 (node`ssl3_get_cipher at s3_lib.c:2786)
get_timeout = 0x00000001019bd2e0 (node`tls1_default_timeout at t1_lib.c:89)
ssl3_enc = 0x0000000102bd58f0
ssl_version = 0x000000010199ac90 (node`ssl_undefined_void_function at ssl_lib.c:3251)
ssl_callback_ctrl = 0x0000000101986c30 (node`ssl3_callback_ctrl at s3_lib.c:3233)
ssl_ctx_callback_ctrl = 0x0000000101987ab0 (node`ssl3_ctx_callback_ctrl at s3_lib.c:3508)
}
sc->ctx_ = SSL_CTX_new(method);
SSL_CTX_set_app_data(sc->ctx_, sc);
The last call is setting the created SecureContext on the OpenSSL CTX created with the previous call. This allows user data to be stored with the context. This function will return 1 if they items were stored successfully and 0 if not. Get will return the data or NULL.
Next:
SSL_CTX_set_options(sc->ctx_, SSL_OP_NO_SSLv2);
SSL_OP_NO_SSLv2 Do not use the SSLv2 protocol. As of OpenSSL 1.0.2g the SSL_OP_NO_SSLv2 option is set by default.
SSL_CTX_set_session_cache_mode(sc->ctx_, SSL_SESS_CACHE_SERVER | SSL_SESS_CACHE_NO_INTERNAL | SSL_SESS_CACHE_NO_AUTO_CLEAR);
Enables session caching. To reuse a session a client must send the session id to the server.
SSL_CTX_set_tlsext_ticket_key_cb(sc->ctx_, SecureContext::TicketCompatibilityCallback); Session tickets are defined in RFC5077 provide an enhanced session resumption capability where the server implementation is not required to maintain per session state.
Back in lib/_tls_common.js
we have
if (secureOptions) this.context.setOptions(secureOptions);
After this we will have returned from the call to new SecureContext.
var ca = options.ca;
If a ca
options was specifed for the options passed into the tls.Server
function.
c.context.addRootCerts();
This will land in SecureContext::AddRootCerts
which in our case the will no be ca certs parsed yet
so the ones in root_certs
static std::vector<X509*> root_certs_vector;
if (root_certs_vector.empty()) {
GCM
This algorithm produces both a cipher text and an authentication tag (think MAC). Example encryption/decryption:
const crypto = require('crypto');
const algo = 'aes-128-gcm';
const key = '6970787039613669314d623455536234';
const iv = '583673497131313748307652';
const plainText = 'Bajja!';
const options = {};
console.log('PlainText: ', plainText);
const encrypt = crypto.createCipheriv(algo,
Buffer.from(key, 'hex'),
Buffer.from(iv, 'hex'),
options);
let cipherText = encrypt.update(plainText, 'utf8', 'hex');
cipherText += encrypt.final('hex');
console.log('CipherText:', cipherText);
const decrypt = crypto.createDecipheriv(algo,
Buffer.from(key, 'hex'),
Buffer.from(iv, 'hex'),
options);
// You have to remember tht GCM is both encryption and authentiction
// and you have to supply the mac/tag that was generated
decrypt.setAuthTag(Buffer.from(encrypt.getAuthTag()));
let decrypted = decrypt.update(cipherText, 'hex', 'utf8');
decrypted += decrypt.final('utf8');
console.log('Decrypted: ', decrypted);
In node_crypto.cc
CipherBase::Finalwe can see that the authentication token is set. The following call be get the tag and set the value of
auth_tag_`.
EVP_CIPHER_CTX_ctrl(ctx_, EVP_CTRL_GCM_GET_TAG, auth_tag_len_, reinterpret_cast<unsigned char*>(auth_tag_)));
Authenticated Encryption with Associated Data (AEAD)
You can also add additional data that will be covered by the authentication tag but not encrypted. This can be useful for package headers which should be allowed to be read, but you still don't want them to be tampered with. Lets take a look at an example:
const crypto = require('crypto');
const algo = 'aes-128-gcm';
const key = '6970787039613669314d623455536234';
const iv = '583673497131313748307652';
const plainText = 'Bajja!';
const options = {};
console.log('PlainText: ', plainText);
const encrypt = crypto.createCipheriv(algo,
Buffer.from(key, 'hex'),
Buffer.from(iv, 'hex'),
options);
encrypt.setAAD(Buffer.from('someheader=10', 'utf8'));
let cipherText = encrypt.update(plainText, 'utf8', 'hex');
cipherText += encrypt.final('hex');
console.log('CipherText:', cipherText);
// Send the cipherText, additional data, and tag someone
const decrypt = crypto.createDecipheriv(algo,
Buffer.from(key, 'hex'),
Buffer.from(iv, 'hex'),
options);
// You have to remember that GCM is both encryption and authentiction
// and you have to supply the mac/tag that was generated
decrypt.setAuthTag(Buffer.from(encrypt.getAuthTag()));
// We have to set the additional data before decrypting
// as decryption is defined as ADAD(K, C, A, T) = (P, A)
decrypt.setAAD(Buffer.from('someheader=10', 'utf8'));
let decrypted = decrypt.update(cipherText, 'hex', 'utf8');
decrypted += decrypt.final('utf8');
// Notice that the additional data is not part of the decrypted message
console.log('Decrypted: ', decrypted);
(lldb) br s -n CipherBase::SetAAD
util.h
This header contains the following util functions/macros:
UncheckedRealloc ... Abort Assert void DumpBacktrace(FILE* fp); FIXED_ONE_BYTE_STRING STRINGIFY_(x) #x STRINGIFY(x) STRINGIFY_(x) CHECK CHECK_EQ ... ListNode ContainerOfHelper
arraysize can be found in `src/node_internals.h'. Just be aware that there is a macro in v8 with the same name (deps/v8/src/base/macros.h)
src/util-inl.h
getUIntOption
This function is part of the lib/internal/crypto/cipher.js
;
function getUIntOption(options, key) {
let value;
if (options && (value = options[key]) != null) {
if (value >>> 0 !== value)
throw new ERR_INVALID_OPT_VALUE(key, value);
return value;
}
return -1;
}
What was intersting is the usage of >>>
, the zero filling right shift operator.
This ensures that value is a valid number (value exists, is numeric, and is integral). If it is it will be unaffected
by the operation. If it's undefined or non-numeric, it will always return zero.
You can also use >>
to do the same thing but for signed int values.
Libraries that the node executable requires
$ otool -L out/Release/node
out/Release/node:
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1259.22.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.1.0)
On mac the System library contains the following components:
- libc The standard c library
- libinfo The netinfo library
- libkvm The kernel virtual memory library
- libm The math lib
- libpthread The POSIX threads library
You can see that these libs are symbolic links:
$ ls -l /usr/lib/libpthread.dylib
lrwxr-xr-x 1 root wheel 15 Jan 13 2016 /usr/lib/libpthread.dylib -> libSystem.dylib
And you can list the symbols using:
$ nm /usr/lib/libSystem.B.dylib
...
The other linked library is libc++.1.dylib
which is the implementation of the C++ standard library, targeting C++11.
For node-gyp you will also need python 2.7, make, and a C++ compiler toolchain.
Module exports
In lib/crypto.js
we have the following line:
module.exports = exports = {
When crypto is required it will have a NativeModule instance created for it, which can be seen in
internal/bootstrap/loaders
:
const nativeModule = new NativeModule(id);
nativeModule.cache();
nativeModule.compile();
And later the function 'fn' will be called.
const script = new ContextifyScript(source, this.filename);
// Arguments: timeout, displayErrors, breakOnSigint
const fn = script.runInThisContext(-1, true, false);
const requireFn = this.id.startsWith('internal/deps/') ?
NativeModule.requireForDeps :
NativeModule.require;
fn(this.exports, requireFn, this, process);
Notice that this.export
and this are being passed in. This will be set to the exports
variable and this
will
be set to the module
varialbe:
(function (exports, require, module, process)
But I'm surprised to see the module.exports = exports =
statement. Why would the exports object have to be set
unless they are pointing to different things?
Well this is because we are overwriting module.exports, and in this case exports and module.exports will not point
to the same object.
Lets take a closer look at:
const fn = script.runInThisContext(-1, true, false);
(lldb) br s -f node_contextify.cc -l 733
TLS
This section is going to take a closer look at what happens on the server side when a client connect using tls. Take the following example:
const tls = require('tls');
const path = require('path');
const fs = require('fs');
const port = 1888;
const options = {
key: fs.readFileSync(path.join(__dirname, 'danbev-key.pem')),
cert: fs.readFileSync(path.join(__dirname, 'danbev-cert.pem'))
};
const server = tls.Server(options, function(socket) {
console.log('Connected : ', socket);
});
server.listen(port, function() {
console.log('Server listening on port ', port);
});
tls.Server
will land in _tls_wrap.js
. The first part of this constructor function checks if options
is a function, in which case it is used as the connection listener (more about it below).
Each server instance has an array of contexts which is initially empty:
this._contexts = [];
After this, the options are set using setOptions
(not sure why but this function is missing a name:
Server.prototype.setOptions = function(options) {
...
}
This should probably be:
Server.prototype.setOptions = function setOptions(options) {
...
}
So what options do we have available:
-
requestCert Clients always request the server certificate to authenticate it and servers can optionally do this by setting this option.
-
rejectUnauthorized Used in conjunction with requestClient (it has to be true) and will be used to set the verifyMode:
if (requestCert || rejectUnauthorized)
ssl.setVerifyMode(requestCert, rejectUnauthorized);
A sessionIdContext, used for session resumption, will be created in setOptions
if one was not passed in:
s.sessionIdContext = crypto.createHash('sha1')
.update(process.argv.join(' '))
.digest('hex')
.slice(0, 32);
Next, tls.createSecureContext
is called which can be found in _tls_common.js
. In this function there are a few
options checks and then a new SecureContext will be create:
const c = new SecureContext(options.secureProtocol, secureOptions, context);
This will call the constructor function SecureContext in also in _tls_common.js
. In our case there was no existing context
passed in so the following path will be taken:
this.context = new NativeSecureContext();
Note that NativeSecureContext
is bound using:
const { SecureContext: NativeSecureContext } = process.binding('crypto');
So this will land in node_crypto.cc and SecureContext::New
void SecureContext::New(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
new SecureContext(env, args.This());
}
Back in _tls_wrap.js
and the Server constructor function after the call to var sharedCreds = tls.createSecureContext
.
Next, the constructor will call net.Server
's constructor:
net.Server.call(this, tlsConnectionListener);
This will pass in the tlsConnectionListener
which will be set in the above called constructor:
this.on('connection', connectionListener);
The connection listener looks like this:
function tlsConnectionListener(rawSocket) {
const socket = new TLSSocket(rawSocket, {
secureContext: this._sharedCreds,
isServer: true,
server: this,
requestCert: this.requestCert,
rejectUnauthorized: this.rejectUnauthorized,
handshakeTimeout: this[kHandshakeTimeout],
ALPNProtocols: this.ALPNProtocols,
SNICallback: this[kSNICallback] || SNICallback
});
socket.on('secure', onSocketSecure);
socket[kErrorEmitted] = false;
socket.on('close', onSocketClose);
socket.on('_tlsError', onSocketTLSError);
}
TLSSocket is a JavaScript land class also defined in _tls_wrap.js
.
After net.Server
returns we have the following in _tls_wrap
's Server constructor function:
if (listener) {
this.on('secureConnection', listener);
}
The listener here is the one passed in to 'tls.Server', so in our case it just logs a message.
So that is creation of the server
object instance in our example. Next will call server.listen
:
server.listen(port, function() {
console.log('Server listening on port ', port);
});
listen
is defined in net.js
. After some checking and normalizing of options the following will be called:
listenInCluster(this, null, options.port | 0, 4, backlog, undefined, options.exclusive);
The definition of listenInCluster
look like this:
function listenInCluster(server, address, port, addressType, backlog, fd, exclusive) {
...
server._listen2(address, port, addressType, backlog, fd);
}
listen2
is an alias for setupListenHandle
:
Server.prototype._listen2 = setupListenHandle;
val = createServerHandle('::', port, 6, fd);
handle = new TCP(TCPConstants.SERVER);
This call will land in tcp_wrap.cc
's TCPWrap::New function
void TCPWrap::New(const FunctionCallbackInfo<Value>& args) {
// This constructor should not be exposed to public javascript.
// Therefore we assert that we are not trying to call this as a
// normal function.
CHECK(args.IsConstructCall());
CHECK(args[0]->IsInt32());
Environment* env = Environment::GetCurrent(args);
int type_value = args[0].As<Int32>()->Value();
TCPWrap::SocketType type = static_cast<TCPWrap::SocketType>(type_value);
ProviderType provider;
switch (type) {
case SOCKET:
provider = PROVIDER_TCPWRAP;
break;
case SERVER:
provider = PROVIDER_TCPSERVERWRAP;
break;
default:
UNREACHABLE();
}
new TCPWrap(env, args.This(), provider);
}
When TCPWrap is called, what is really happening is that the
(lldb) jlh this->This()
0x186f76c6b591: [JS_API_OBJECT_TYPE]
- map: 0x186f88ee1481 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x186ff17efcb1 <Object map = 0x186f9db4fcd1>
- elements: 0x186ffb702251 <FixedArray[0]> [HOLEY_ELEMENTS]
- embedder fields: 1
- properties: 0x186f76c6b5b1 <PropertyArray[6]> {
#reading: 0x186ffb7023f1 <false> (data field 0) properties[0]
#owner: 0x186ffb702201 <null> (data field 1) properties[1]
#onread: 0x186ffb702201 <null> (data field 2) properties[2]
#onconnection: 0x186ffb702201 <null> (data field 3) properties[3]
}
- embedder fields = {
0x186ffb7022e1
}
These properties are configured in TCPWrap::Initialize:
Local<FunctionTemplate> t = env->NewFunctionTemplate(New);
Local<String> tcpString = FIXED_ONE_BYTE_STRING(env->isolate(), "TCP");
t->SetClassName(tcpString);
t->InstanceTemplate()->SetInternalFieldCount(1);
// Init properties
t->InstanceTemplate()->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "reading"),
Boolean::New(env->isolate(), false));
t->InstanceTemplate()->Set(env->owner_string(), Null(env->isolate()));
t->InstanceTemplate()->Set(env->onread_string(), Null(env->isolate()));
t->InstanceTemplate()->Set(env->onconnection_string(), Null(env->isolate()));
How does the object returned by this->This()
above get created?
(lldb) br s -f builtins-api.cc -l 126
We can take a look at deps/v8/src/builtins/builtins-api.cc
:
BUILTIN(HandleApiCall) {
HandleScope scope(isolate);
Handle<JSFunction> function = args.target();
Handle<Object> receiver = args.receiver();
Handle<HeapObject> new_target = args.new_target();
Handle<FunctionTemplateInfo> fun_data(function->shared()->get_api_func_data(),
isolate);
if (new_target->IsJSReceiver()) {
RETURN_RESULT_OR_FAILURE(isolate,
HandleApiCallHelper<true>(isolate, function, new_target, fun_data, receiver, args));
BUILTIN
is a macro (deps/v8/src/builtins/builtins-utils.h). Here is what the above would look like once expanded
by the preprocessor:
MUST_USE_RESULT static Object* Builtin_Impl_HandleApiCall(BuiltinArguments args Isolate* isolate);
V8_NOINLINE static Object* Builtin_Impl_Stats_HandleApiCall(int args_length, Object** args_object, Isolate* isolate) {
BuiltinArguments args(args_length, args_object);
RuntimeCallTimerScope timer(isolate, RuntimeCallCounterId::kBuiltin_HandleApiCall);
TRACE_EVENT0(TRACE_DISABLED_BY_DEFAULT("v8.runtime"), "V8.Builtin_HandleApiCall");
return Builtin_Impl_HandleApiCall(args, isolate);
}
MUST_USE_RESULT Object* Builtin_HandleApiCall(int args_length, Object** args_object, Isolate* isolate) {
DCHECK(isolate->context() == nullptr || isolate->context()->IsContext());
if (V8_UNLIKELY(FLAG_runtime_stats)) {
return Builtin_Impl_Stats_HandleApiCall(args_length, args_object, isolate);
}
BuiltinArguments args(args_length, args_object);
return Builtin_Impl_HandleApiCall(args, isolate);
}
MUST_USE_RESULT static Object* Builtin_Impl_HandleApiCall(BuiltinArguments args, Isolate* isolate) {
HandleScope scope(isolate);
Handle<JSFunction> function = args.target();
Handle<Object> receiver = args.receiver();
Handle<HeapObject> new_target = args.new_target();
Handle<FunctionTemplateInfo> fun_data(function->shared()->get_api_func_data(),
isolate);
if (new_target->IsJSReceiver()) {
RETURN_RESULT_OR_FAILURE(isolate,
HandleApiCallHelper<true>(isolate, function, new_target, fun_data, receiver, args));
}
(lldb) expr receiver->Print()
#hole
(lldb) expr function->shared()->Print()
0x3e62e0db9829: [SharedFunctionInfo] in OldSpace
- map: 0x3e62323027f1 <Map(HOLEY_ELEMENTS)>
- name: 0x3e62e0da4f11 <String[3]: TCP>
- kind: NormalFunction
- function_map_index: 128
- formal_parameter_count: -1
- expected_nof_properties: 0
- language_mode: sloppy - code: 0x16dedcb1eea1 <Code BUILTIN>
- function token position: 0
- start position: 0
- end position: 0
- no debug info
- scope info: 0x3e62a2602459 <ScopeInfo[0]>
- length: 0
- feedback_metadata: 0x3e62a2602251: [FeedbackMetadata] in OldSpace
- map: 0x3e6232302341 <Map(HOLEY_ELEMENTS)>
- length: 0 (empty)
- no preparsed scope data
(lldb) expr function->shared()->get_api_func_data()
(v8::internal::FunctionTemplateInfo *) $9 = 0x00003e62e0db6339
get_api_func_data
:
FunctionTemplateInfo* SharedFunctionInfo::get_api_func_data() {
DCHECK(IsApiFunction());
return FunctionTemplateInfo::cast(function_data());
}
Lets take a closer look at HandleApiCallHelper
which is called with the following arguments:
RETURN_RESULT_OR_FAILURE(
isolate, HandleApiCallHelper<true>(isolate, function, new_target,
fun_data, receiver, args));
template <bool is_construct>
MUST_USE_RESULT MaybeHandle<Object> HandleApiCallHelper(
Isolate* isolate, Handle<HeapObject> function,
Handle<HeapObject> new_target, Handle<FunctionTemplateInfo> fun_data,
Handle<Object> receiver, BuiltinArguments args) {
Handle<JSObject> js_receiver;
JSObject* raw_holder;
if (is_construct) {
DCHECK(args.receiver()->IsTheHole(isolate));
if (fun_data->instance_template()->IsUndefined(isolate)) {
v8::Local<ObjectTemplate> templ =
ObjectTemplate::New(reinterpret_cast<v8::Isolate*>(isolate),
ToApiHandle<v8::FunctionTemplate>(fun_data));
fun_data->set_instance_template(*Utils::OpenHandle(*templ));
}
Handle<ObjectTemplateInfo> instance_template(
ObjectTemplateInfo::cast(fun_data->instance_template()), isolate);
ASSIGN_RETURN_ON_EXCEPTION(
isolate, js_receiver,
ApiNatives::InstantiateObject(instance_template,
Handle<JSReceiver>::cast(new_target)),
Object);
args[0] = *js_receiver;
DCHECK_EQ(*js_receiver, *args.receiver());
raw_holder = *js_receiver;
} else {
Notice that the ObjectTemplateInfo is retrieved from the shared function data:
Handle<ObjectTemplateInfo> instance_template(ObjectTemplateInfo::cast(fun_data->instance_template()), isolate);
If we inspect it we find:
lldb) expr instance_template->number_of_properties()
(int) $28 = 4
(lldb) expr instance_template->property_list()
(v8::internal::Object *) $29 = 0x00003e62ada58031
(lldb) expr instance_template->property_list()->Print()
0x3e62ada58031: [FixedArray] in OldSpace
- map: 0x3e6232302341 <Map(HOLEY_ELEMENTS)>
- length: 20
0: 12
1: 0x3e62ada58011 <String[7]: reading>
2: 192
3: 0x3e62a26023f1 <false>
4: 0x3e62afc83339 <String[5]: owner>
5: 192
6: 0x3e62a2602201 <null>
7: 0x3e62afc83139 <String[6]: onread>
8: 192
9: 0x3e62a2602201 <null>
10: 0x3e62afc82f21 <String[12]: onconnection>
11: 192
12: 0x3e62a2602201 <null>
13-19: 0x3e62a2602321 <the_hole>
Now, if we look back at TCPWrap::Initialize
we can see that this matches the properties set:
t->InstanceTemplate()->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "reading"),
Boolean::New(env->isolate(), false));
t->InstanceTemplate()->Set(env->owner_string(), Null(env->isolate()));
t->InstanceTemplate()->Set(env->onread_string(), Null(env->isolate()));
t->InstanceTemplate()->Set(env->onconnection_string(), Null(env->isolate()));
The next call is to instantiate the object:
ASSIGN_RETURN_ON_EXCEPTION(
isolate, js_receiver,
ApiNatives::InstantiateObject(instance_template,
Handle<JSReceiver>::cast(new_target)),
Object);
Ignoring the macro for now lets take a closer look at ApiNatives::InstantiateObject
which we can find
in deps/v8/src/api-natives.cc
:
Isolate* isolate = data->GetIsolate();
InvokeScope invoke_scope(isolate);
return ::v8::internal::InstantiateObject(isolate, data, new_target, false, false);
MaybeHandle<JSObject> InstantiateObject(Isolate* isolate,
Handle<ObjectTemplateInfo> info,
Handle<JSReceiver> new_target,
bool is_hidden_prototype,
bool is_prototype) {
int serial_number = Smi::ToInt(info->serial_number());
if (!new_target.is_null()) {
if (IsSimpleInstantiation(isolate, *info, *new_target)) {
constructor = Handle<JSFunction>::cast(new_target);
...
Handle<JSObject> object;
ASSIGN_RETURN_ON_EXCEPTION(isolate, object,
JSObject::New(constructor, new_target), JSObject);
...
The call to JSObject::New
will end up in deps/v8/src/objects.cc
which will create the instance,
and then that new instance, object
above, will be configured:
ASSIGN_RETURN_ON_EXCEPTION(
isolate, result,
ConfigureInstance(isolate, object, info, is_hidden_prototype), JSObject);
ConfigureInstance
is also in deps/v8/src/api-natives.cc
.
Object* maybe_property_list = data->property_list();
We can inspect the property_list
and verify that it matches the properties set using:
(lldb) expr maybe_property_list->Print()
0x3e62ada58031: [FixedArray] in OldSpace
- map: 0x3e6232302341 <Map(HOLEY_ELEMENTS)>
- length: 20
0: 12
1: 0x3e62ada58011 <String[7]: reading>
2: 192
3: 0x3e62a26023f1 <false>
4: 0x3e62afc83339 <String[5]: owner>
5: 192
6: 0x3e62a2602201 <null>
7: 0x3e62afc83139 <String[6]: onread>
8: 192
9: 0x3e62a2602201 <null>
10: 0x3e62afc82f21 <String[12]: onconnection>
11: 192
12: 0x3e62a2602201 <null>
13-19: 0x3e62a2602321 <the_hole>
Notice that we have first an integer, then the string, another integer and a boolean value.
Next the properties iterated through:
for (int c = 0; c < data->number_of_properties(); c++) {
auto name = handle(Name::cast(properties->get(i++)), isolate);
Object* bit = properties->get(i++);
if (bit->IsSmi()) {
PropertyDetails details(Smi::cast(bit));
PropertyAttributes attributes = details.attributes();
PropertyKind kind = details.kind();
if (kind == kData) {
auto prop_data = handle(properties->get(i++), isolate);
RETURN_ON_EXCEPTION(isolate, DefineDataProperty(isolate, obj, name,
prop_data, attributes),
JSObject);
} else {
auto getter = handle(properties->get(i++), isolate);
auto setter = handle(properties->get(i++), isolate);
RETURN_ON_EXCEPTION(
isolate, DefineAccessorProperty(isolate, obj, name, getter, setter,
attributes, is_hidden_prototype),
JSObject);
}
} else {
// Intrinsic data property --- Get appropriate value from the current
// context.
PropertyDetails details(Smi::cast(properties->get(i++)));
PropertyAttributes attributes = details.attributes();
DCHECK_EQ(kData, details.kind());
v8::Intrinsic intrinsic =
static_cast<v8::Intrinsic>(Smi::ToInt(properties->get(i++)));
auto prop_data = handle(GetIntrinsic(isolate, intrinsic), isolate);
RETURN_ON_EXCEPTION(isolate, DefineDataProperty(isolate, obj, name,
prop_data, attributes),
JSObject);
}
}
return obj;
Looking at the above the first index in the array seems to be the size or limit of the entries. So the first property is named 'reading', its bit is:
(lldb) expr name->Print()
"reading"
(lldb) expr bit->Print()
Smi: 0xc0 (192)
(lldb) expr details.kind()
(v8::internal::PropertyKind) $171 = kData
(lldb) expr details.location()
(v8::internal::PropertyLocation) $172 = kField
(lldb) expr details.attributes()
(v8::internal::PropertyAttributes) $173 = NONE
PropertyDetails
can be found in deps/v8/src/property-details.h
and the constructor that takes an Smi
can be found in deps/v8/src/objects-inl.h
:
PropertyDetails::PropertyDetails(Smi* smi) {
value_ = smi->value();
}
While stepping through the above for look we can inspect that obj
that the properties are getting added to:
(lldb) expr obj->Print()
0x38ec09dd0641: [JS_API_OBJECT_TYPE]
- map: 0x38ec21004501 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x38ecff033799 <Object map = 0x38ec483199a1>
- elements: 0x38ec11d82251 <FixedArray[0]> [HOLEY_ELEMENTS]
- embedder fields: 1
- properties: 0x38ec09dd0699 <PropertyArray[3]> {
#reading: 0x38ec11d823f1 <false> (data field 0) properties[0]
#owner: 0x38ec11d82201 <null> (data field 1) properties[1]
}
- embedder fields = {
0x38ec11d822e1
}
The above shows the reading
, and owner
properties have been added.
After returning from we'll again be in HandleApiCallHelper
:
ASSIGN_RETURN_ON_EXCEPTION(isolate, js_receiver,
ApiNatives::InstantiateObject(instance_template,
Handle<JSReceiver>::cast(new_target)),
Object);
args[0] = *js_receiver;
And we can verify that js_receiver is the object created and configured/populated:
(lldb) expr js_receiver->Print()
0x38ec09dd0839: [JS_API_OBJECT_TYPE]
- map: 0x38ec210045a1 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x38ecff033799 <Object map = 0x38ec483199a1>
- elements: 0x38ec11d82251 <FixedArray[0]> [HOLEY_ELEMENTS]
- embedder fields: 1
- properties: 0x38ec09dd0859 <PropertyArray[6]> {
#reading: 0x38ec11d823f1 <false> (data field 0) properties[0]
#owner: 0x38ec11d82201 <null> (data field 1) properties[1]
#onread: 0x38ec11d82201 <null> (data field 2) properties[2]
#onconnection: 0x38ec11d82201 <null> (data field 3) properties[3]
}
- embedder fields = {
0x38ec11d822e1
}
Next, we have:
FunctionCallbackArguments custom(isolate, data_obj, *function, raw_holder,
*new_target, &args[0] - 1,
args.length() - 1);
Handle<Object> result = custom.Call(call_data);
FunctionCallbackArguments(internal::Isolate* isolate, internal::Object* data,
internal::HeapObject* callee,
internal::Object* holder,
internal::HeapObject* new_target,
internal::Object** argv, int argc)
: Super(isolate), argv_(argv), argc_(argc) {
Object** values = begin();
values[T::kDataIndex] = data;
values[T::kHolderIndex] = holder;
values[T::kNewTargetIndex] = new_target;
values[T::kIsolateIndex] = reinterpret_cast<internal::Object*>(isolate);
// Here the hole is set as default value.
// It cannot escape into js as it's remove in Call below.
values[T::kReturnValueDefaultValueIndex] =
isolate->heap()->the_hole_value();
values[T::kReturnValueIndex] = isolate->heap()->the_hole_value();
DCHECK(values[T::kHolderIndex]->IsHeapObject());
DCHECK(values[T::kIsolateIndex]->IsSmi());
}
custom.Call
can be found in deps/v8/src/api-arguments.cc
:
Handle<Object> FunctionCallbackArguments::Call(CallHandlerInfo* handler) {
Isolate* isolate = this->isolate();
LOG(isolate, ApiObjectAccess("call", holder()));
RuntimeCallTimerScope timer(isolate, RuntimeCallCounterId::kFunctionCallback);
v8::FunctionCallback f =
v8::ToCData<v8::FunctionCallback>(handler->callback());
if (isolate->needs_side_effect_check() &&
!isolate->debug()->PerformSideEffectCheckForCallback(FUNCTION_ADDR(f))) {
return Handle<Object>();
}
VMState<EXTERNAL> state(isolate);
ExternalCallbackScope call_scope(isolate, FUNCTION_ADDR(f));
FunctionCallbackInfo<v8::Value> info(begin(), argv_, argc_);
f(info);
return GetReturnValue<Object>(isolate);
}
(lldb) expr f
(v8::FunctionCallback) $384 = 0x000000010019ce20 (node`node::TCPWrap::New(v8::FunctionCallbackInfo<v8::Value> const&) at tcp_wrap.cc:137)
And we can see how this function is called by f(info)
.
Notice the last line where we create a new TCPWrap and that the instance is not saved or returned.
TCPWrap::TCPWrap(Environment* env, Local<Object> object, ProviderType provider)
: ConnectionWrap(env, object, provider) {
int r = uv_tcp_init(env->event_loop(), &handle_);
CHECK_EQ(r, 0);
}
TCPWrap has a constructor but no destructor (apart from the default one that is). So this is the logic needed to be performed
for New
. But we also have to look at what ConnectionWrap
's constructor does:
template <typename WrapType, typename UVType>
ConnectionWrap<WrapType, UVType>::ConnectionWrap(Environment* env, Local<Object> object, ProviderType provider)
: LibuvStreamWrap(env, object, reinterpret_cast<uv_stream_t*>(&handle_), provider) {}
Notice that handle_
is from connection_wrap.h
:
UVType handle_;
It does not have any logic apart from delegating to LibuvStreamWrap
's contructor:
LibuvStreamWrap::LibuvStreamWrap(Environment* env, Local<Object> object, uv_stream_t* stream, AsyncWrap::ProviderType provider)
: HandleWrap(env, object, reinterpret_cast<uv_handle_t*>(stream), provider),
StreamBase(env),
stream_(stream) {
}
As we can see LibuvStreamWrap
inherits from HandleWrap
StreamBase
and delegates to those constructors
Let's take a look at HandleWrap's constructor:
HandleWrap::HandleWrap(Environment* env,
Local<Object> object,
uv_handle_t* handle,
AsyncWrap::ProviderType provider)
: AsyncWrap(env, object, provider),
state_(kInitialized),
handle_(handle) {
handle_->data = this;
HandleScope scope(env->isolate());
Wrap(object, this);
env->handle_wrap_queue()->PushBack(this);
}
And we see that it delegates to AsyncWrap
's constructor:
AsyncWrap::AsyncWrap(Environment* env,
Local<Object> object,
ProviderType provider,
double execution_async_id,
bool silent)
: BaseObject(env, object),
provider_type_(provider) {
CHECK_NE(provider, PROVIDER_NONE);
CHECK_GE(object->InternalFieldCount(), 1);
// Shift provider value over to prevent id collision.
persistent().SetWrapperClassId(NODE_ASYNC_ID_OFFSET + provider_type_);
// Use AsyncReset() call to execute the init() callbacks.
AsyncReset(execution_async_id, silent);
}
And AsyncWrap
's constructor delegates to BaseObject:
BaseObject::BaseObject(Environment* env, v8::Local<v8::Object> handle)
: persistent_handle_(env->isolate(), handle),
env_(env) {
CHECK_EQ(false, handle.IsEmpty());
CHECK_GT(handle->InternalFieldCount(), 0);
handle->SetAlignedPointerInInternalField(0, static_cast<void*>(this));
}
Recall that handle
was passed via the TCPWrap constructor using args.This():
(lldb) expr handle
(v8::Local<v8::Object>) $413 = (val_ = 0x00007fff5fbfbcc8)
(lldb) expr info.This()
(v8::Local<v8::Object>) $411 = (val_ = 0x00007fff5fbfbcc8)
(lldb) jlh handle
0x38ec09dd0839: [JS_API_OBJECT_TYPE]
- map: 0x38ec210045a1 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x38ecff033799 <Object map = 0x38ec483199a1>
- elements: 0x38ec11d82251 <FixedArray[0]> [HOLEY_ELEMENTS]
- embedder fields: 1
- properties: 0x38ec09dd0859 <PropertyArray[6]> {
#reading: 0x38ec11d823f1 <false> (data field 0) properties[0]
#owner: 0x38ec11d82201 <null> (data field 1) properties[1]
#onread: 0x38ec11d82201 <null> (data field 2) properties[2]
#onconnection: 0x38ec11d82201 <null> (data field 3) properties[3]
}
- embedder fields = {
0x38ec11d822e1
}
So, have stored this object, the receiver, as a persistent object meaning that this heap allocated object will have a non-local scope (is not tied to the C++ scopes) and must be cleared explicitely with Persistent::Reset.
Next, we are going to store a reference to this
which is the current BaseObject (which is really our TCPWrap instance) and
we can see that we are storing the reference to that instance, it will be attached to the receiver object in the internal
field.
And now the constructors will "bubble" up and first is AsyncWrap's constructor:
persistent().SetWrapperClassId(NODE_ASYNC_ID_OFFSET + provider_type_);
The provider_type_
in our case is:
(lldb) expr provider_type_
(const node::AsyncWrap::ProviderType) $436 = PROVIDER_TCPSERVERWRAP
Now, the wrapper class id is used V8's RetainedObjectInfo
which enables us to provide information
about native objects for heap snapshots.
In Environment::Start
in env.cc
we have:
SetupProcessObject(this, argc, argv, exec_argc, exec_argv);
LoadAsyncWrapperInfo(this);
Lets take a closer look at LoadAsyncWraperInfo
:
void LoadAsyncWrapperInfo(Environment* env) {
HeapProfiler* heap_profiler = env->isolate()->GetHeapProfiler();
#define V(PROVIDER) \
heap_profiler->SetWrapperClassInfoProvider( \
(NODE_ASYNC_ID_OFFSET + AsyncWrap::PROVIDER_ ## PROVIDER), WrapperInfo);
NODE_ASYNC_PROVIDER_TYPES(V)
#undef V
}
And after the preprocessor has expanded that (just showing TCPSERVERWRAP
):
void LoadAsyncWrapperInfo(Environment* env) {
HeapProfiler* heap_profiler = env->isolate()->GetHeapProfiler();
heap_profiler->SetWrapperClassInfoProvider(
(NODE_ASYNC_ID_OFFSET + AsyncWrap::PROVIDER_TCPSERVERWRAP), WrapperInfo);
....
}
This binds the callback WrapperInfo
to the class id:
RetainedObjectInfo* WrapperInfo(uint16_t class_id, Local<Value> wrapper) {
// No class_id should be the provider type of NONE.
CHECK_GT(class_id, NODE_ASYNC_ID_OFFSET);
// And make sure the class_id doesn't extend past the last provider.
CHECK_LE(class_id - NODE_ASYNC_ID_OFFSET, AsyncWrap::PROVIDERS_LENGTH);
CHECK(wrapper->IsObject());
CHECK(!wrapper.IsEmpty());
Local<Object> object = wrapper.As<Object>();
CHECK_GT(object->InternalFieldCount(), 0);
AsyncWrap* wrap;
ASSIGN_OR_RETURN_UNWRAP(&wrap, object, nullptr);
return new RetainedAsyncInfo(class_id, wrap);
}
WrapperInfo
will be called by HeapProfiler::ExecuteWrapperClassCallback
which is called
by VisitSubtreeWrapper
in `deps/v8/src/profiler/heap-snapshot-generator.cc'
If we start node with --track-heap-objects
heap tracking will be enabled. And we can use
heapdump.writeSnapshot()
from module heapdump
to generate a heap dump:
const heapdump = require('heapdump');
...
const server = tls.Server(options, function(socket) {
console.log('Connected : ', socket);
heapdump.writeSnapshot();
});
$ lldb -- out/Debug/node --track-heap-objects ../scripts/tls-server.js
(lldb) br s -f env.cc -l 117
(lldb) r
$ out/Debug/node ../scripts/tls-client.js
And we will be able to see that the callback WrapperInfo
is called is called.
A snapshot can be taken by calling:
(lldb) expr reinterpret_cast<v8::internal::HeapProfiler*>(env->isolate()->GetHeapProfiler())->TakeSnapshot(nullptr, nullptr)
So, now we understand what the setting of class id is used for. Lets move on. The next constructor is HandleWrap
's.
A libuv uv_handle_t is a base type for all libuv handle types and HandleWrap wraps
one of these.
HandleWrap::HandleWrap(Environment* env,
Local<Object> object,
uv_handle_t* handle,
AsyncWrap::ProviderType provider)
: AsyncWrap(env, object, provider),
state_(kInitialized),
handle_(handle) {
handle_->data = this;
HandleScope scope(env->isolate());
env->handle_wrap_queue()->PushBack(this);
}
Private members:
ListNode<HandleWrap> handle_wrap_queue_;
enum { kInitialized, kClosing, kClosingWithCallback, kClosed } state_;
uv_handle_t* const handle_;
Also notice handle_wrap_queue_
which will be constructed for each new instance. We can find the constructor for NodeList in
util-inl.h
:
ListNode<T>::ListNode() : prev_(this), next_(this) {}
Just to be clear what this
is:
(lldb) expr *this
(node::ListNode<node::HandleWrap>) $555 = {
prev_ = 0x0000000104fb9730
next_ = 0x0000000104fb9730
}
So initially prev_
and next_
point to this
.
Next, back in HandleWrap's constructor state_ is set to kInitilized and
Notice that HandleWrap
has a private member named handle_
. BaseObject' constructor takes a parameter named handle_
but they are of
different types. I've found this confusing sometimes when stepping through code and the especially the inheritance chain for wrap
objects (like TCPWrap for example).
The handle wrap queue is used in env.h:
typedef ListHead<HandleWrap, &HandleWrap::handle_wrap_queue_> HandleWrapQueue;
typedef ListHead<ReqWrap<uv_req_t>, &ReqWrap<uv_req_t>::req_wrap_queue_>
ReqWrapQueue;
inline HandleWrapQueue* handle_wrap_queue() { return &handle_wrap_queue_; }
inline ReqWrapQueue* req_wrap_queue() { return &req_wrap_queue_; }
Notice handle_wrap_queue_
is of type HandleWrapQueue*
and that ListHead is a template class:
template <typename T, ListNode<T> (T::*M)>
class ListHead {
...
So every time a handle is created, like what happens when a new TCPWrap instance is created it will be added
the the handle queue which contains the active handles. This can be inspected by calling get_activeHandles
env->SetMethod(process, "_getActiveRequests", GetActiveRequests);
env->SetMethod(process, "_getActiveHandles", GetActiveHandles);
Also so in HandleWrap's constructor we have:
handle_->data = this;
This is setting the data member of the handle_ struct to this. This allows this instance to be retreived in libuv callbacks. For example, if we take a look at connection_wrap.cc's OnConnection we can see that this is retreived:
WrapType* wrap_data = static_cast<WrapType*>(handle->data);
When we have seen that calling tls.Server(opions, callback) will find its way to TCPWrap::New which will call TCPWrap::TCPWrap:
int r = uv_tcp_init(env->event_loop(), &handle_);
Lets take a look atht the handle_ after uv_tcp_init
(lldb) expr handle_
(uv_tcp_s) $52 = {
data = 0x0000000104eebbd0
loop = 0x0000000102c72300
type = UV_TCP
close_cb = 0x0000000000000000
handle_queue = ([0] = 0x0000000102c72310, [1] = 0x0000000104bab578)
u = {
fd = 33066
reserved = ([0] = 0x000000000000812a, [1] = 0x0000000000000000, [2] = 0x0000000000000000, [3] = 0x0000000000000002)
}
next_closing = 0x0000000000000000
flags = 8192
write_queue_size = 0
alloc_cb = 0x0000000000000000
read_cb = 0x0000000000000000
connect_req = 0x0000000000000000
shutdown_req = 0x0000000000000000
io_watcher = {
cb = 0x0000000101983450 (node`uv__stream_io at stream.c:1297)
pending_queue = ([0] = 0x0000000104eebcf8, [1] = 0x0000000104eebcf8)
watcher_queue = ([0] = 0x0000000104eebd08, [1] = 0x0000000104eebd08)
pevents = 0
events = 0
fd = -1
rcount = 0
wcount = 0
}
write_queue = ([0] = 0x0000000104eebd30, [1] = 0x0000000104eebd30)
write_completed_queue = ([0] = 0x0000000104eebd40, [1] = 0x0000000104eebd40)
connection_cb = 0x0000000000000000
delayed_error = 0
accepted_fd = -1
queued_fds = 0x0000000000000000
select = 0x0000000000000000
}
(lldb) expr &handle_
(uv_tcp_s *) $53 = 0x0000000104eebc68
And lets put a break point in connection_wrap.cc and inspect the handle passed to that function:
(lldb) expr handle
(uv_stream_t *) $54 = 0x0000000104eebc68
As described above the libuv handle will have be the same and the data was set to TCPWrap by HandleWrap's constructor.
WrapType* wrap_data = static_cast<WrapType*>(handle->data);
(lldb) expr wrap_data
(node::TCPWrap *) $66 = 0x00000001061e3d90
Next we have
Local<Object> client_obj = WrapType::Instantiate(env,
wrap_data,
WrapType::SOCKET);
This will call TCPWrap::Instantiate
in our case. Just note that there was one TCPwrap instance for the server's
listen setup and not there is one for the connection.
This function will later make a callback to onconnection
in lib/net.js.
While stepping through this call I came accross isLegalPort
in internal/net.js
:
function isLegalPort(port) {
if ((typeof port !== 'number' && typeof port !== 'string') ||
(typeof port === 'string' && port.trim().length === 0))
return false;
return +port === (+port >>> 0) && port <= 0xFFFF;
}
So is port is negative we make it positive and then check if is an unsigned int:
+port === (+port >>> 0)
And the check that the port is not great than 65535:
&& port <= 65535
net.js
will emit an onconnect
event and pass the socket. _tls_wrap
will be listening for this event and the listener
function is named tlsConnectionListener
. This listener is added by the following call:
// constructor call
net.Server.call(this, tlsConnectionListener);
This will call the server constructor in net.js
:
function Server(options, connectionListener) {
if (!(this instanceof Server))
return new Server(options, connectionListener);
EventEmitter.call(this);
if (typeof options === 'function') {
connectionListener = options;
options = {};
this.on('connection', connectionListener);
} else if (options == null || typeof options === 'object') {
options = options || {};
if (typeof connectionListener === 'function') {
this.on('connection', connectionListener);
}
} else {
throw new ERR_INVALID_ARG_TYPE('options', 'Object', options);
}
Notice this.on('connection', connectionListener)
which is where the listener is registered.
So, tlsConnectionListener will
const socket = new TLSSocket(rawSocket, {
secureContext: this._sharedCreds,
isServer: true,
server: this,
requestCert: this.requestCert,
rejectUnauthorized: this.rejectUnauthorized,
handshakeTimeout: this[kHandshakeTimeout],
ALPNProtocols: this.ALPNProtocols,
SNICallback: this[kSNICallback] || SNICallback
});
Notice that there is a default SNICallback.
I noticed that lib/tls.js
requires internal/tls
which only has one function named parseCertString, and I'm
wondering why this is not in internal/crypto/util or something? It is deprecated and will be removed later.
Duplex constructor
I found that in the Duplex constructor we have the following code:
if (options && options.readable === false)
this.readable = false;
if (options && options.writable === false)
this.writable = false;
this.allowHalfOpen = true;
if (options && options.allowHalfOpen === false) {
this.allowHalfOpen = false;
this.once('end', onend);
}
Notice that options is checked for undefined in all three if statements. My first reaction was to fix this but then I wondered if v8 could optimize this. How do I check that? We can see the generated bytecode for a function using `--print_bytecode
[generated bytecode for function: beve]
Parameter count 2
Frame size 24
75 E> 0x2e8ae77c5f1a @ 0 : 95 StackCheck
89 S> 0x2e8ae77c5f1b @ 1 : 1c 02 Ldar a0
0x2e8ae77c5f1d @ 3 : 87 23 JumpIfToBooleanFalse [35] (0x2e8ae77c5f40 @ 38) 112 E> 0x2e8ae77c5f1f @ 5 : 1f 02 00 00 LdaNamedProperty a0, [0], [0]
0x2e8ae77c5f23 @ 9 : 1d fb Star r0
0x2e8ae77c5f25 @ 11 : 09 01 LdaConstant [1]
117 E> 0x2e8ae77c5f27 @ 13 : 5b fb 02 TestEqualStrict r0, [2]
0x2e8ae77c5f2a @ 16 : 89 16 JumpIfFalse [22] (0x2e8ae77c5f40 @ 38)
137 S> 0x2e8ae77c5f2c @ 18 : 0a 02 03 LdaGlobal [2], [3]
0x2e8ae77c5f2f @ 21 : 1d fa Star r1
145 E> 0x2e8ae77c5f31 @ 23 : 1f fa 03 05 LdaNamedProperty r1, [3], [5]
0x2e8ae77c5f35 @ 27 : 1d fb Star r0
0x2e8ae77c5f37 @ 29 : 09 04 LdaConstant [4]
0x2e8ae77c5f39 @ 31 : 1d f9 Star r2
145 E> 0x2e8ae77c5f3b @ 33 : 4d fb fa f9 07 CallProperty1 r0, r1, r2, [7]
168 S> 0x2e8ae77c5f40 @ 38 : 1c 02 Ldar a0
0x2e8ae77c5f42 @ 40 : 87 23 JumpIfToBooleanFalse [35] (0x2e8ae77c5f65 @ 75)
191 E> 0x2e8ae77c5f44 @ 42 : 1f 02 05 09 LdaNamedProperty a0, [5], [9]
0x2e8ae77c5f48 @ 46 : 1d fb Star r0
0x2e8ae77c5f4a @ 48 : 03 20 LdaSmi [32]
195 E> 0x2e8ae77c5f4c @ 50 : 5b fb 0b TestEqualStrict r0, [11]
0x2e8ae77c5f4f @ 53 : 89 16 JumpIfFalse [22] (0x2e8ae77c5f65 @ 75)
209 S> 0x2e8ae77c5f51 @ 55 : 0a 02 03 LdaGlobal [2], [3]
0x2e8ae77c5f54 @ 58 : 1d fa Star r1
217 E> 0x2e8ae77c5f56 @ 60 : 1f fa 03 0c LdaNamedProperty r1, [3], [12]
0x2e8ae77c5f5a @ 64 : 1d fb Star r0
0x2e8ae77c5f5c @ 66 : 09 06 LdaConstant [6]
0x2e8ae77c5f5e @ 68 : 1d f9 Star r2
217 E> 0x2e8ae77c5f60 @ 70 : 4d fb fa f9 0e CallProperty1 r0, r1, r2, [14]
0x2e8ae77c5f65 @ 75 : 04 LdaUndefined
237 S> 0x2e8ae77c5f66 @ 76 : 99 Return
Constant pool (size = 7)
0x2e8ae77c5e69: [FixedArray] in OldSpace
- map: 0x2e8a12202341 <Map(HOLEY_ELEMENTS)>
- length: 7
0: 0x2e8ad922c8e1 <String[4]: name>
1: 0x2e8ae77c5db9 <String[6]: Daniel>
2: 0x2e8ad92236b9 <String[7]: console>
3: 0x2e8ad9206421 <String[3]: log>
4: 0x2e8ae77c5dd9 <String[8]: got name>
5: 0x2e8ae77c4d59 <String[3]: age>
6: 0x2e8ae77c5df9 <String[7]: got age>
ModuleWrap
To look into this a little closer we can step through the following:
$ lldb -- out/Debug/node --expose-internals --inspect-brk ../scripts/module_wrap.js
(lldb) br s -n ModuleWrap::Initialize
(lldb) br s -n ModuleWrap::New
(lldb) r
So when we call:
const mod = new ModuleWrap('export * from "bar"; 6;', 'foo');
This will break in ModuleWrap::New.
(lldb) jlh source_text
#export * from "bar"; 6;
(lldb) jlh url
#foo
You can see the functions avaiable to mod:
Now ModuleWrap is also used by lib/internal/vm/module.js
:
const {
ModuleWrap,
kUninstantiated,
kInstantiating,
kInstantiated,
kEvaluating,
kEvaluated,
kErrored,
} = internalBinding('module_wrap');
The Module class's constructor creates an instance of ModuleWrap:
const wrap = new ModuleWrap(src, url, context, lineOffset, columnOffset);
Lets take a look at an example of using the new modules:
const vm = require('vm');
const contextifiedSandbox = vm.createContext({ secret: 42 });
(async () => {
const bar = new vm.Module(`
import s from 'foo';
s;
`, { context: contextifiedSandbox });
async function linker(specifier, referencingModule) {
if (specifier === 'foo') {
return new vm.Module(`
// The "secret" variable refers to the global variable we added to
// "contextifiedSandbox" when creating the context.
export default secret;
`, { context: referencingModule.context });
}
throw new Error(`Unable to resolve dependency: ${specifier}`);
}
await bar.link(linker);
bar.instantiate();
const { result } = await bar.evaluate();
console.log(result);
// Prints 42.
})();
First a contextifiedSandbox
is created by calling vm.createContext
.
function createContext(sandbox = {}, options = {}) {
if (isContext(sandbox)) {
return sandbox;
}
The above isContext
will call a javascript function that checks the input parmeter sandbox and then
will call _isContext
which will call the c++ function Contextify::IsContext in node_contextify:
void ContextifyContext::IsContext(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CHECK(args[0]->IsObject());
Local<Object> sandbox = args[0].As<Object>();
Maybe<bool> result =
sandbox->HasPrivate(env->context(),
env->contextify_context_private_symbol());
args.GetReturnValue().Set(result.FromJust());
}
We can inspect the sandbox
object using:
(lldb) jlh sandbox
0x3222865311d1: [JS_OBJECT_TYPE] in OldSpace
- map: 0x3222f9466981 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x3222aca04479 <Object map = 0x3222f94022a1>
- elements: 0x322215182251 <FixedArray[0]> [HOLEY_ELEMENTS]
- properties: 0x322215182251 <FixedArray[0]> {
#secret: 42 (data field 0)
}
Next, we check this object to see if it has any private. Private is a symbol which is not converted
to a string and exposed by Object.getOwnPropertyNames. Only using the symbol reference can one set
and retrieve values from the object. A list of assigned symbols for a given object can still be
accessed with the Object.getOwnPropertySymbols function.
So in this case we are checking if there is a symbol named node:contextify:context
was added to
this sandbox object, something like this in javascript (but done in c++):
var s = Symbol('node:contextify:context');
sandbox[s] = true;
We can see that the passed in sandbox instance does not have it set:
lldb) expr result
(v8::Maybe<bool>) $10 = (has_value_ = true, value_ = false)
After this createContext
will setup and check the options passed and the call createContext
:
makeContext(sandbox, name, origin, strings, wasm);
(lldb) jlh env->contextify_context_private_symbol()
0x3222aca022c9: [Symbol] in OldSpace
- map: 0x322236e82701 <Map(HOLEY_ELEMENTS)>
- hash: 686904622
- name: 0x3222aca02299 <String[23]: node:contextify:context>
- private: 1
ContextifyContext::MakeContext
will check the options passed in and then create a ContextOptions with
them which is the passed to the ContextifyConext
constructor:
ContextifyContext* context = new ContextifyContext(env, sandbox, options);
ContextifyContext will call ContextifyContext::CreateV8Context
This context will then be using a private symbol on the sandbox object:
sandbox->SetPrivate(
env->context(),
env->contextify_context_private_symbol(),
External::New(env->isolate(), context));
Breaking in bootstraper code
Previsouly I was able to run a script using --inspect-brk
and then afterwards set a break point anywhere in node's bootstrap files, then rerun and the debugger would break
there. This does not seem to work for my anymore, but what does work is setting a debugger;
line in the code, so you can stick that in one of the javascript files in
lib/internal/bootstrap
and you should be able to break in them.
WebAssembly (WASM)
The text format for wasm is of type S-expressions where the first label inside a parentheses tell what kind of node it is:
(module (memory 1) (func))
The above has a root node named module
and two child nodes, memory
and func
.
All code is grouped into functions:
(func <signature> <locals> <body>)
The signature declares the functions parameters and its return type. The locals are local variables to the function The body is a list of instructions for the fuction.
(module
(func $add (param $first i32) (param $second i32) (result i32)
get_local $first
get_local $second
(i32.add)
)
(export "add" (func $add))
)
Notice that a wasm "program" is simply named a module as the intention is to have it included and run by another program.
The body is stack based so get_local
will push $first onto the stack. i32.add
will take two values from the stack, add then and push
the result onto the stack.
Notice the $add
in the function. This is much like the parameters that are index based but can be named to make the code
clearer. So we could just as well written:
(export "add" (func 0))
export is a function that makes the function available using the name add
in our case.
You can compile the above .wat file to wasm using wabt:
$ out/clang/Debug/wat2wasm ~/work/nodejs/scripts/wasm-helloworld.wat -o helloworld.wasm
And the use the wasm from javascript:
const fs = require('fs');
const buffer = fs.readFileSync('helloworld.wasm');
const promise = WebAssembly.instantiate(buffer, {});
promise.then((result) => {
const instance = result.instance;
const module = result.module;
console.log('instance.exports:', instance.exports);
const addTwo = instance.exports.addTwo;
console.log(addTwo(1, 2));
});
Lets take a closer look at the WebAssembly API.
`WebAssembly` is the how the api is exposed.
WebAssembly.instantiate:
compiles and instanciates wasm code and returns both an object with two
members `module` and `instance`.
`WebAssembly.Memory` is used to deal with more complex objects like strings. Is just a large array of bytes which can grow. You can read/write
using i32.load and i32.store.
Memory is specified using WebAssembly.Memory{}:
```javascript
const memory = new WebAssembly.Memory({initial:10, maximum:100});
10
and 100
are specified in pages which are fixed to 64KiB. So here we are saying that we want an initial size of 640KiB.
`WebAssembly` is a builtin object in V8 (I think, still have to verify this).
To inspect the .wasm you can use wasm-objdump:
```console
$ wasm-objdump -x src/add.wasm
add.wasm:file format wasm 0x1
Section Details:
Type:
- type[0] (i32, i32) -> i32
Function:
- func[0] sig=0 <add>
- func[1] sig=0 <addTwo>
Export:
- func[0] <add> -> "add"
- func[1] <addTwo> -> "addTwo"
When
const buffer = fs.readFileSync('../scripts/helloworld.wasm');
WebAssembly.validate(buffer);
From reading some docs/spec I read that validate compiles the wasm, remember that it is in
binary format and in our case we transformed it from text format (wat) to binary format (wasm).
This set is compiling and returns true or false depending on if it was successful. TODO: where is this done in V8?
WebAssembly.validate(buffer)
is a builtin
src/wasm/wasm-objects.h
:
class WasmInstanceObject : public JSObject {
public:
DECL_CAST(WasmInstanceObject)
...
// Layout description.
#define WASM_INSTANCE_OBJECT_FIELDS(V) \
V(kCompiledModuleOffset, kPointerSize) \
V(kExportsObjectOffset, kPointerSize) \
V(kMemoryObjectOffset, kPointerSize) \
V(kGlobalsBufferOffset, kPointerSize) \
V(kDebugInfoOffset, kPointerSize) \
V(kTableObjectOffset, kPointerSize) \
V(kFunctionTablesOffset, kPointerSize) \
V(kImportedFunctionInstancesOffset, kPointerSize) \
V(kImportedFunctionCallablesOffset, kPointerSize) \
V(kIndirectFunctionTableInstancesOffset, kPointerSize) \
V(kManagedNativeAllocationsOffset, kPointerSize) \
V(kManagedIndirectPatcherOffset, kPointerSize) \
V(kFirstUntaggedOffset, 0) /* marker */ \
V(kMemoryStartOffset, kPointerSize) /* untagged */ \
V(kMemorySizeOffset, kUInt32Size) /* untagged */ \
V(kMemoryMaskOffset, kUInt32Size) /* untagged */ \
V(kImportedFunctionTargetsOffset, kPointerSize) /* untagged */ \
V(kGlobalsStartOffset, kPointerSize) /* untagged */ \
V(kIndirectFunctionTableSigIdsOffset, kPointerSize) /* untagged */ \
V(kIndirectFunctionTableTargetsOffset, kPointerSize) /* untagged */ \
V(kIndirectFunctionTableSizeOffset, kUInt32Size) /* untagged */ \
V(k64BitArchPaddingOffset, kPointerSize - kUInt32Size) /* padding */ \
V(kSize, 0)
DEFINE_FIELD_OFFSET_CONSTANTS(JSObject::kHeaderSize,
WASM_INSTANCE_OBJECT_FIELDS)
#undef WASM_INSTANCE_OBJECT_FIELDS
#define DEFINE_FIELD_OFFSET_CONSTANTS(StartOffset, LIST_MACRO) \
enum { \
LIST_MACRO##_StartOffset = StartOffset - 1, \
LIST_MACRO(DEFINE_ONE_FIELD_OFFSET) \
};
Lets take a look at the what one for these expand to:
enum {
WASM_INSTANCE_OBJECT_FIELDS_StartOffset = JSObject::kHeaderSize - 1,
kCompiledModuleOffset, kCompiledModuleOffsetEnd = kCompiledModuleOffset + kPointerSize - 1,
ExportsObjectOffset, ExportsObjectOffset = kCompiledModuleOffset + kPointerSize - 1,
}
V(kCompiledModuleOffset, kPointerSize) \
Node native wasm modules
The idea is to allow wasm to be loaded natively, similar to a native module, without having to go view the JavaScript API.
Currently, to do any system calls in WASM the options we have are to call out to javascript. This would be done using an import in wasm:
(module
(func $imported (import "imports" "imported_func") (param i32))
...
In this case imported_func
will have to passed to the wasm module by using the importObject:
var importObject = {
imports: {
imported_func: function(nr) {
console.log(nr);
}
}
};
Now, as we can see imported_func
is implemented in JavaScript but it could have been a native binding function written in C++ too.
What we are trying to accomplish is to be able to specify an import function what is not passed in from outside, but instead node will try to bind the name to a native function by using the import name in the wasm. For this we would need to be able to register some kind of callback with V8 does it's look up on the importsObject. For example:
(module
(func (import "fopen" "nodejs") (param i32) (param i32))
)
$ lldb -- out/Debug/node --trace-wasm-compiler test/parallel/test-wasm-builtin.js
(lldb) settings set target.non-stop-mode true
(lldb) br s -n LookupImport
Stepping through the code I think that perhaps if there was a way to have a callback for InstanceBuilder::LookupImport
we should be able to check the module_name
is equal to nodejs
and then do something.
If we take a look at InstantiateToInstanceObject
in deps/v8/src/wasm/module-compiler
:
InstanceBuilder builder(isolate, thrower, module_object, imports, memory);
auto instance = builder.Build();
Build
will build a WasmInstanceeObject. Notice imports' which will be set as
ffi_` in the builder instance.
In our case the imports object could be null.
(lldb) expr imports.ToHandleChecked()->Print()
0x312eb5a75191: [JS_OBJECT_TYPE]
- map: 0x312eb3ad6f79 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x312ed17043a1 <Object map = 0x312eb3a822b1>
- elements: 0x312eef982251 <FixedArray[0]> [HOLEY_ELEMENTS]
- properties: 0x312eef982251 <FixedArray[0]> {
#imports: 0x312eb5a751e9 <Object map = 0x312eb3ad6fd1> (data field 0)
}
But this is because we have passed in an importObject
with a property named imports
. What happens if we don't have
pass in an imports object?
There will be an exception in this case. Just for testing and investigation I've added a callback on the V8 isolate
to allow an empty import object to be passed.
isolate->SetAllowWasmEmptyImportObjectCallback(AllowWasmEmptyImportObjectCallback);
With this implace we get further but run into the following:
# Fatal error in ../deps/v8/src/wasm/module-compiler.cc, line 1738
# Debug check failed: !ffi_.is_null().
#
#
#
#FailureMessage Object: 0x7fff5fbf9f40Illegal instruction: 4
Lets back up a little. When we call WebAssembly.validate(buffer)
the function that we land in
is WebAssemblyInstantiateCallback
(deps/v8/src/wasm/wasm-js.cc
):
(lldb) br s -n WebAssemblyInstantiateCallback
Local<Value> module = args[0];
Intersting that the module is passed in to this function:
(lldb) jlh module
0x3137f054b099: [WASM_MODULE_TYPE]
- map: 0x313775888871 <Map(HOLEY_ELEMENTS)> [FastProperties]
- prototype: 0x3137d4eee691 <Object map = 0x313775888ce9>
- elements: 0x31375c282251 <FixedArray[0]> [HOLEY_ELEMENTS]
- properties: 0x31375c282251 <FixedArray[0]> {}
Next, we have:
Local<Value> instance;
if (!WebAssemblyInstantiateImpl(isolate, module, args.Data()).ToLocal(&instance)) {
return;
}
WebAssemblyInstantiateImpl
does:
instance_object = i_isolate->wasm_engine()->SyncInstantiate(
i_isolate, &thrower, i::Handle<i::WasmModuleObject>::cast(module_obj),
maybe_imports, i::MaybeHandle<i::JSArrayBuffer>());
SyncInstantiate
can be found in deps/v8/src/wasm/wasm-engine.cc
which delegates to InstantiateToInstanceObject
in deps/v8/src/wasm/module-compiler.cc
.
InstanceBuilder builder(isolate, thrower, module_object, imports, memory);
auto instance = builder.Build();
In build()
we find the check we added:
if (!module_->import_table.empty() && ffi_.is_null() &&
!isolate_->allow_wasm_empty_import_object_callback()) {
thrower_->TypeError( "Imports argument must be present and must be an object");
return {};
So we will then proceed with SanitizeImports()
:
for (size_t index = 0; index < module_->import_table.size(); ++index) {
WasmImport& import = module_->import_table[index];
(lldb) frame var module_->import_table[0]
(v8::internal::wasm::WasmImport) module_->import_table[0] = {
module_name = (offset_ = 25, length_ = 5)
field_name = (offset_ = 31, length_ = 6)
kind = kExternalFunction
index = 0
}
(lldb) frame var module_->import_table[0].module_name
(v8::internal::wasm::WireBytesRef) module_->import_table[0].module_name = (offset_ = 25, length_ = 5)
(lldb) frame var module_->import_table[0].field_name
(v8::internal::wasm::WireBytesRef) module_->import_table[0].field_name = (offset_ = 31, length_ = 6)
Lets take a look at the module_
:
(lldb) expr *module_
(v8::internal::wasm::WasmModule) $40 = {
signature_zone = {
__ptr_ = {
std::__1::__libcpp_compressed_pair_imp<v8::internal::Zone *, std::__1::default_delete<v8::internal::Zone>, 2> = {
__first_ = 0x0000000105a02060
}
}
}
initial_pages = 0
maximum_pages = 0
has_shared_memory = false
has_maximum_pages = false
has_memory = false
mem_export = false
start_function_index = -1
globals = size=0 {}
globals_size = 0
num_imported_functions = 1
num_declared_functions = 1
num_exported_functions = 1
name = (offset_ = 0, length_ = 0)
signatures = size=2 {
[0] = 0x0000000107006a20
[1] = 0x0000000107006a40
}
signature_ids = size=2 {
[0] = 0
[1] = 1
}
functions = size=2 {
[0] = {
sig = 0x0000000107006a20
func_index = 0
sig_index = 0
code = (offset_ = 0, length_ = 0)
imported = true
exported = false
}
[1] = {
sig = 0x0000000107006a40
func_index = 1
sig_index = 1
code = (offset_ = 58, length_ = 8)
imported = false
exported = true
}
}
data_segments = size=0 {}
function_tables = size=0 {}
import_table = size=1 {
[0] = {
module_name = (offset_ = 25, length_ = 5)
field_name = (offset_ = 31, length_ = 6)
kind = kExternalFunction
index = 0
}
}
export_table = size=1 {
[0] = {
name = (offset_ = 47, length_ = 5)
kind = kExternalFunction
index = 1
}
}
exceptions = size=0 {}
table_inits = size=0 {}
signature_map = {
next_ = 2
frozen_ = true
map_ = size=2 {
[0] = {
first = 0x0000000107006a40
second = 1
}
[1] = {
first = 0x0000000107006a20
second = 0
}
}
}
origin_ = kWasmOrigin
names_ = {
__ptr_ = {
std::__1::__libcpp_compressed_pair_imp<std::__1::unordered_map<unsigned int, v8::internal::wasm::WireBytesRef, std::__1::hash<unsigned int>, std::__1::equal_to<unsigned int>, std::__1::allocator<std::__1::pair<const unsigned int, v8::internal::wasm::WireBytesRef> > > *, std::__1::default_delete<std::__1::unordered_map<unsigned int, v8::internal::wasm::WireBytesRef, std::__1::hash<unsigned int>, std::__1::equal_to<unsigned int>, std::__1::allocator<std::__1::pair<const unsigned int, v8::internal::wasm::WireBytesRef> > > >, 2> = {
__first_ = 0x0000000105904ab0 size=0
}
}
}
}
Next we have:
MaybeHandle<Object> result = module_->is_asm_js()
? LookupImportAsm(int_index, import_name)
: LookupImport(int_index, module_name, import_name);
This will land us in
MaybeHandle<Object> InstanceBuilder::LookupImport(uint32_t index,
Handle<String> module_name,
Handle<String> import_name) {
(lldb) job *module_name
"fopen"
(lldb) job *import_name
"nodejs"
DCHECK(!ffi_.is_null());
This will then abort (I'm using a debug build of V8).
The following is the code that retrieves the module from the import.
result = Object::GetPropertyOrElement(ffi_.ToHandleChecked(), module_name);
ffi_
is the importObject passed in to validate, and module_name is the name of the property
to look up. In our case this would be fopen
.
(lldb) job *name
"fopen"
How about we pass the ffi_
to the callback and it can add the object to it, and if the import object
is null then create a new instance?
So we would add the ffi as a parameter to our callback. This leads me to an issue since ffi_ is of type
MaybeHandle<Receiver> which is an internal type. How to I pass this to the callback propertly without violating
the API. Would it be alright to consider this an extension of the internal V8 API and there for it might
be alright to include src/objects.h
in the callbacks source file?
The proposal for this can be found in this branch
There are callback such as:
isolate->SetWasmModuleCallback(NodeWasm::WasmModuleCallback);
These callbacks will be called by WebAssemblyModule
and WebAssemblyInstance
in deps/v8/src/wasm/wasm-js.cc
.
Those functions are called when:
const fs = require('fs');
const buffer = fs.readFileSync('import.wasm');
WebAssembly.validate(buffer);
const m = new WebAssembly.Module(buffer);
var importObject = {
imports: {
imported_func: arg => console.log('imported_func:', arg)
}
};
const instance = new WebAssembly.Instance(m, importObject);
console.log(instance.exports.exported_func());
const m = new WebAssembly.Module(buffer);
In this case both of the callback will be called. But if we use WebAssembly.instantiate
the won't get
called:
WebAssembly.instantiate(buffer).then((results) => {
const fd = results.instance.exports.fopen();
assert.strictEqual(fd, 22);
});
v8::Object vs v8::internal::Object
v8::Object is part of the public api declared in include/v8.h. v8::internal::Object is declared in the interal api in src/. It look like ToApiHandle does what I'm looking for. But what exactly does it do?
n-api
This is a C API to allow for ABI compatability between Node versions. So a native app does not have to be recompiled to work with a different version of node.
Lets take a look test/addons-napi/1_hello_world/binding.c
as an example:
NAPI_MODULE_INIT() {
napi_property_descriptor desc = DECLARE_NAPI_PROPERTY("hello", Method);
NAPI_CALL(env, napi_define_properties(env, exports, 1, &desc));
return exports;
}
This macro can be found in src/node_api.h:
#define NAPI_MODULE_INIT() \
EXTERN_C_START \
NAPI_MODULE_EXPORT napi_value \
NAPI_MODULE_INITIALIZER(napi_env env, napi_value exports); \
EXTERN_C_END \
NAPI_MODULE(NODE_GYP_MODULE_NAME, NAPI_MODULE_INITIALIZER) \
napi_value NAPI_MODULE_INITIALIZER(napi_env env, \
napi_value exports)
This expands to:
$ clang -E test/addons-napi/1_hello_world/binding.c -Isrc
__attribute__((visibility("default"))) napi_value napi_register_module_v1(napi_env env, napi_value exports);
static napi_module _module =
{ 1,
0,
"test/addons-napi/1_hello_world/binding.c",
napi_register_module_v1,
"NODE_GYP_MODULE_NAME",
((void*)0),
{0},
};
static void _register_NODE_GYP_MODULE_NAME(void) __attribute__((constructor));
static void _register_NODE_GYP_MODULE_NAME(void) {
napi_module_register(&_module);
}
napi_value napi_register_module_v1(napi_env env, napi_value exports) {
napi_property_descriptor desc = { ("hello"), 0, (Method), 0, 0, 0, napi_default, 0 };
do {
if ((napi_define_properties(env, exports, 1, &desc)) != napi_ok) {
do {
const napi_extended_error_info *error_info;
napi_get_last_error_info(((env)), &error_info);
_Bool is_pending;
napi_is_exception_pending(((env)), &is_pending);
if (!is_pending) {
const char* error_message = error_info->error_message != ((void*)0) ? error_info->error_message : "empty error message";
napi_throw_error(((env)), ((void*)0), error_message);
}
} while (0); return ((void*)0);
}
} while (0);
return exports;
wasm c-api
The following repo, git@github.com:rossberg/wasm-c-api.git a c api to allow you to use function defined in wasm from C/C++.
Make sure you configure V8 to have the following configuration options:
$ gn args out.gn/x64.release/
is_debug = false
target_cpu = "x64"
is_component_build = false
v8_static_library = true
V8 is quite large and I looks like wasm-c-api expects v8 to be cloned in the same directory. I just updated the Makefile to allow the V8 dir to be configured to allow building using:
$ make V8_DIR="/Users/danielbevenius/work/google/javascript" CFLAGS="-g"
Lets take a look at the example in example/hello.c
:
int main(int argc, const char* argv[]) {
// Initialize.
printf("Initializing...\n");
wasm_init(argc, argv);
wasm_store_t* store = wasm_store_new();
....
}
One thing to notice is that there are no dependencies to any JavaScript engine. `wasm.h` is the only
included header (apart from standard headers).
`wasm_store_t` represents the [store](https://webassembly.github.io/spec/core/exec/runtime.html#store) in
the spec (I think) which
What does wasm_init
do?
This function can be found in src/wasm-v8.cc
:
void wasm_init(int argc, const char *const argv[]) {
wasm_init_with_config(argc, argv, nullptr);
}
wasm_init_with_config
:
void wasm_init_with_config(int argc, const char *const argv[], wasm_config_t* config) {
v8::V8::InitializeExternalStartupData(argv[0]);
static std::unique_ptr<v8::Platform> platform = v8::platform::NewDefaultPlatform();
v8::V8::InitializePlatform(platform.get());
v8::V8::Initialize();
}
We can recogniize this and the V8 initialization/setup. So this will create a V8 environment
to allow execution.
Next, we have wasm_store_new
:
std::unique_ptr<wasm_store_t> store(new wasm_store_t);
if (store.get() == nullptr) return nullptr;
store->create_params_.array_buffer_allocator = v8::ArrayBuffer::Allocator::NewDefaultAllocator();
wasm_store_t
is a class which has the following private members:
v8::Isolate::CreateParams create_params_;
v8::Isolate *isolate_;
v8::Eternal<v8::Context> context_;
v8::Eternal<v8::ObjectTemplate> callback_data_template_;
v8::Eternal<v8::String> strings_[V8_S_COUNT];
v8::Eternal<v8::Function> functions_[V8_F_COUNT];
v8::Eternal<v8::Object> cache_;
Next in wasm_store_new a new Isolate is created:
auto isolate = v8::Isolate::New(store->create_params_);
...
v8::Isolate::Scope isolate_scope(isolate);
v8::HandleScope handle_scope(isolate);
auto context = v8::Context::New(isolate);
if (context.IsEmpty()) return nullptr;
v8::Context::Scope context_scope(context);
auto callback_data_template = v8::ObjectTemplate::New(isolate);
if (callback_data_template.IsEmpty()) return nullptr;
callback_data_template->SetInternalFieldCount(1);
store->isolate_ = isolate;
store->context_ = v8::Eternal<v8::Context>(isolate, context);
store->callback_data_template_ = v8::Eternal<v8::ObjectTemplate>(isolate, callback_data_template);
Next,
static const char* const raw_strings[V8_S_COUNT] = {
"function", "global", "table", "memory",
"module", "name", "kind", "exports",
"i32", "i64", "f32", "f64", "anyref", "anyfunc",
"value", "mutable", "element", "initial", "maximum",
"buffer"
};
for (int i = 0; i < V8_S_COUNT; ++i) {
auto maybe = v8::String::NewFromUtf8(isolate, raw_strings[i], v8::NewStringType::kNormal);
if (maybe.IsEmpty()) return nullptr;
auto string = maybe.ToLocalChecked();
store->strings_[i] = v8::Eternal<v8::String>(isolate, string);
}
The above is creating V8 strings that will persiste for the life time of the isolate.
auto global = context->Global();
auto maybe_wasm_name = v8::String::NewFromUtf8(isolate, "WebAssembly", v8::NewStringType::kNormal);
if (maybe_wasm_name.IsEmpty()) return nullptr;
auto wasm_name = maybe_wasm_name.ToLocalChecked();
auto maybe_wasm = global->Get(context, wasm_name);
if (maybe_wasm.IsEmpty()) return nullptr;
auto wasm = v8::Local<v8::Object>::Cast(maybe_wasm.ToLocalChecked());
Next, wasm
is set to the WebAssembly
builtin (Verify this when debugging):
(lldb) expr wasm
(v8::Local<v8::Object>) $25 = (val_ = 0x00000001028020c8)
Next in main the wasm file is read and loaded.
printf("Creating callbacks...\n");
own wasm_functype_t* print_type1 = wasm_functype_new_1_1(wasm_valtype_new_i32(), wasm_valtype_new_i32());
own wasm_func_t* print_func1 = wasm_func_new(store, print_type1, print_wasm);
Notice that print_wasm
is a function in hello.c.
SetClassId
./src/async_wrap.cc:611:29: warning: 'SetWrapperClassInfoProvider' is deprecated [-Wdeprecated-declarations]
NODE_ASYNC_PROVIDER_TYPES(V)
In deps/v8/include/v8-profiler.h
:
V8_DEPRECATED(
"Use SetBuildEmbedderGraphCallback to provide info about embedder nodes",
void SetWrapperClassInfoProvider(uint16_t class_id,
WrapperInfoCallback callback));
In src/async_wrap.cc we have:
void LoadAsyncWrapperInfo(Environment* env) {
HeapProfiler* heap_profiler = env->isolate()->GetHeapProfiler();
#define V(PROVIDER) \
heap_profiler->SetWrapperClassInfoProvider( \
(NODE_ASYNC_ID_OFFSET + AsyncWrap::PROVIDER_ ## PROVIDER), WrapperInfo);
NODE_ASYNC_PROVIDER_TYPES(V)
#undef V
}
`LoadAsyncWrapperInfo` is called from Environment::Start:
```c++
SetupProcessObject(this, argc, argv, exec_argc, exec_argv);
LoadAsyncWrapperInfo(this);
So the preprocessor will generate calls for all the NODE_ASYNC_PROVIDER_TYPES:
heap_profiler->SetWrapperClassInfoProvider(NODE_ASYNC_ID_OFFSET + AsyncWrap::PROVIDER_TCP, WrapperInfo);
Lets figure out how this works and add a break point:
t
EmbedderGraph
is a graph that contains node object (embedder) and V8 objects
CodeCache
There is a new feature in node where a code cache can be utilized for builtins. Note that these are the builtins that node provides, that is the ones located in the lib directory.
Just to keep things clear, there is an action named js2c which transforms the
JavaScript source code in the lib
directory into C byte arrays. These are then
available in the Node.js binary. The output of the js2c action is a c++
file named out/Release/obj/gen/node_javascript.cc
. There was previously a
placeholder for this in src/node_javascript.cc
but not anymore.
So when is LoadJavaScriptSource()
called?
There is a GYP action named mkcodecache which is a dependency on the node target.
This is an executable (source can be found in tools/code_cache/mkcodecache.cc).
mkcodecache includes cache_builder.h and cache_builder.cc includes node_native_module.h,
which in turn has a global variable in node_native_module.cc:
NativeModuleLoader NativeModuleLoader::instance_;
NativeModuleLoader's constructor calls LoadJavaScriptSource() which is how the source_ variable is populated using the byte arrays generated by j2sc. This will happen in the initialisation stage before the main function has been entered.
This is actually called before entering the main function of mkcodecache.
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
* frame #0: 0x000000010039bf01 mkcodecache`node::native_module::NativeModuleLoader::NativeModuleLoader(this=0x00000001017bac78) at node_native_module.cc:25 [opt]
frame #1: 0x000000010039dd5f mkcodecache`_GLOBAL__sub_I_node_native_module.cc [inlined] node::native_module::NativeModuleLoader::NativeModuleLoader(this=0x00000001017bac78) at node_native_module.cc:24 [opt]
frame #2: 0x000000010039dd4e mkcodecache`_GLOBAL__sub_I_node_native_module.cc [inlined] __cxx_global_var_init at node_native_module.cc:22 [opt]
frame #3: 0x000000010039dd4e mkcodecache`_GLOBAL__sub_I_node_native_module.cc at node_native_module.cc:0 [opt]
frame #4: 0x00000001026b5cc8 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 518
frame #5: 0x00000001026b5ec6 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
frame #6: 0x00000001026b10da dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 358
frame #7: 0x00000001026b0254 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 134
frame #8: 0x00000001026b02e8 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 74
frame #9: 0x000000010269f774 dyld`dyld::initializeMainExecutable() + 199
frame #10: 0x00000001026a478f dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 6237
frame #11: 0x000000010269e4f6 dyld`dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) + 1154
frame #12: 0x000000010269e036 dyld`_dyld_start + 54
If we look in src/node_native_module.cc
we have a global named instance_
:
NativeModuleLoader NativeModuleLoader::instance_;
This will call NativeModuleLoader's constructor which is what calls LoadJavaScriptSource
which populates the source_
variable.
out/Release/obj/gen/node_javascript.cc
includes node_native_module.h
which is
how this works. Notice that source_
is of type:
using NativeModuleRecordMap = std::map<std::string, UnionBytes>
...
NativeModuleRecordMap source_;
void NativeModuleLoader::LoadJavaScriptSource() {
source_.emplace("internal/bootstrap/environment", UnionBytes{internal_bootstrap_environment_raw, 374});
So we can now see how the source_ variable is getting populated and that we have only been dealing with transforming the JavaScript sources into C byte arrays, there has been no compilation or caching yet.
Previously, the code caching was controlled by the --code-cache-path configure option. This was a path but it is now just hard coded as yes in configure.py:
o['variables']['node_code_cache_path'] = 'yes'
So it will always be set regardless and what ever is specified as the path (it takes an option) will be ignored.
There is a GYP action named 'mkcodecache' which is a dependency on the node target.
$ /Users/danielbevenius/work/nodejs/node-poc/out/Release/mkcodecache /Users/danielbevenius/work/nodejs/node-poc/out/Release/obj/gen/node_code_cache.cc
Lets debug this and take a closer look at what is happening:
$ lldb out/Release/mkcodecache out/Release/obj/gen/node_code_cache.cc
(lldb) br s -n main
First a V8 isolate is created and initialized. We later have the following function call:
std::string cache = CodeCacheBuilder::Generate(context);
This function can be found in tools/code_cache/cache_builder.cc
.
NativeModuleLoader* loader = NativeModuleLoader::GetInstance();
std::vector<std::string> ids = loader->GetModuleIds();
GetModuleIds will iterate through the source_ map and add the keys to the ids vector:
(lldb) expr ids
(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >) $2 = size=203 {
[0] = "_http_agent"
[1] = "_http_client"
[2] = "_http_common"
[3] = "_http_incoming"
[4] = "_http_outgoing"
...
In the loop there is the following call:
if (loader->CanBeRequired(id.c_str()))
CanBeRequired
is
bool NativeModuleLoader::CanBeRequired(const char* id) {
return GetCanBeRequired().count(id) == 1;
}
There are some modules that are not allowed to be required, and in some configurations like when configuring --without-ssl there are some that should not be allowed to be required. This is what this check is about.
Next, the module will be compiled:
std::map<std::string, ScriptCompiler::CachedData*> data;
...
for (const auto& id : ids) {
// TODO(joyeecheung): we can only compile the modules that can be
// required here because the parameters for other types of builtins
// are still very flexible. We should look into auto-generating
// the paramters from the source somehow.
if (loader->CanBeRequired(id.c_str())) {
NativeModuleLoader::Result result;
USE(loader->CompileAsModule(context, id.c_str(), &result));
ScriptCompiler::CachedData* cached_data = loader->GetCodeCache(id.c_str());
if (cached_data == nullptr) {
// TODO(joyeecheung): display syntax errors
std::cerr << "Failed to complile " << id << "\n";
} else {
data.emplace(id, cached_data);
}
}
}
Then the actual code cache is generated by calling:
return GenerateCodeCache(data, log_progress);
This can also be found in cache_builder.cc and
ss << R"(#include <cinttypes>
#include "node_native_module_env.h"
// This file is generated by tools/mkcodecache
// and is used when configure is run with \`--code-cache-path\`
namespace node {
namespace native_module {
const bool has_code_cache = true;
)";
Notice this usage of a Raw String in which escape characters will not be processed and whitespace will be preserved too. So this is what will end up in generated string (later written to a obj/gen/node_code_cache.cc) by mkcodecache.cc.
When is the cache used? During the startup process of node we have:
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
* frame #0: 0x000000010195c174 node`node::native_module::NativeModuleEnv::InitializeCodeCache() at node_code_cache.cc:650
frame #1: 0x000000010014c5c3 node`node::InitializeNodeWithArgs(argv=0x00007ffeefbfe9c8 size=1, exec_argv=0x00007ffeefbfe9e0 size=0, errors=0x00007ffeefbfe570 size=0) at node.cc:908
frame #2: 0x000000010014d454 node`node::InitializeOncePerProcess(argc=1, argv=0x0000000103f16840) at node.cc:981
frame #3: 0x000000010014e1b4 node`node::Start(argc=1, argv=0x00007ffeefbfeb38) at node.cc:1039
frame #4: 0x000000010195c15e node`main(argc=1, argv=0x00007ffeefbfeb38) at node_main.cc:126
frame #5: 0x00007fff5d2f7085 libdyld.dylib`start + 1
frame #6: 0x00007fff5d2f7085 libdyld.dylib`start + 1
So InitializeNodeWithArgs
has the following call:
NativeModuleEnv::InitializeCodeCache();
The implementation for this function can be found in out/Debug/obj/gen/node_code_cache.cc.
NativeModuleCacheMap& code_cache = *NativeModuleLoader::GetInstance()->code_cache();
(lldb) expr code_cache
(node::native_module::NativeModuleCacheMap) $0 = size=0 {}
The rest of the InitializeCodeCache will populate the cache with statements like:
code_cache.emplace(
"_http_agent",
std::make_unique<v8::ScriptCompiler::CachedData>(
_http_agent,
static_cast<int>(arraysize(_http_agent)), policy
)
);
After this there are some more things done but they are not related to caching. Instead we will be back in node::Start
{
Isolate::CreateParams params;
// TODO(joyeecheung): collect external references and set it in
// params.external_references.
std::vector<intptr_t> external_references = { reinterpret_cast<intptr_t>(nullptr)};
v8::StartupData* blob = NodeMainInstance::GetEmbeddedSnapshotBlob();
const std::vector<size_t>* indexes = NodeMainInstance::GetIsolateDataIndexes();
if (blob != nullptr) {
params.external_references = external_references.data();
params.snapshot_blob = blob;
}
NodeMainInstance main_instance(¶ms,
uv_default_loop(),
per_process::v8_platform.Platform(),
result.args,
result.exec_args,
indexes);
result.exit_code = main_instance.Run();
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
* frame #0: 0x000000010024129c node`node::NodeMainInstance::CreateMainEnvironment(this=0x00007ffeefbfe8c0, exit_code=0x00007ffeefbfe5a4) at node_main_instance.cc:208
frame #1: 0x00000001002404d5 node`node::NodeMainInstance::Run(this=0x00007ffeefbfe8c0) at node_main_instance.cc:101
frame #2: 0x000000010014e45d node`node::Start(argc=1, argv=0x00007ffeefbfeb38) at node.cc:1064
frame #3: 0x000000010195c15e node`main(argc=1, argv=0x00007ffeefbfeb38) at nod
if (env->RunBootstrapping().IsEmpty()) {
*exit_code = 1;
}
This will bring us back in node.cc:
MaybeLocal<Value> Environment::RunBootstrapping() {
EscapableHandleScope scope(isolate_);
CHECK(!has_run_bootstrapping_code());
if (BootstrapInternalLoaders().IsEmpty()) {
return MaybeLocal<Value>();
}
Local<Value> result;
if (!BootstrapNode().ToLocal(&result)) {
return MaybeLocal<Value>();
}
// Make sure that no request or handle is created during bootstrap -
// if necessary those should be done in pre-execution.
// TODO(joyeecheung): print handles/requests before aborting
CHECK(req_wrap_queue()->IsEmpty());
CHECK(handle_wrap_queue()->IsEmpty());
set_has_run_bootstrapping_code(true);
return scope.Escape(result);
}
Lets take a closer look at BootstrapInternalLoaders which will first set up the arguments for calling ExecuteBootstrapper:
// Create binding loaders
std::vector<Local<String>> loaders_params = {
process_string(),
FIXED_ONE_BYTE_STRING(isolate_, "getLinkedBinding"),
FIXED_ONE_BYTE_STRING(isolate_, "getInternalBinding"),
primordials_string()};
std::vector<Local<Value>> loaders_args = {
process_object(),
NewFunctionTemplate(binding::GetLinkedBinding)->GetFunction(context()).ToLocalChecked(),
NewFunctionTemplate(binding::GetInternalBinding)->GetFunction(context()).ToLocalChecked(),
primordials()};
if (!ExecuteBootstrapper(
this, "internal/bootstrap/loaders", &loaders_params, &loaders_args)
.ToLocal(&loader_exports)) {
return MaybeLocal<Value>();
}
Notice that we are passing internal/bootstrap/loaders
into this with the arguments.
MaybeLocal<Value> ExecuteBootstrapper(Environment* env,
const char* id,
std::vector<Local<String>>* parameters,
std::vector<Local<Value>>* arguments) {
MaybeLocal<Function> maybe_fn = NativeModuleEnv::LookupAndCompile(env->context(), id, parameters, env);
Now, this is calling into the NativeModuleEnv class, which will use the
source_
map to try to find the module:
const auto source_it = source_.find(id);
...
std::string filename_s = id + std::string(".js");
(lldb) expr filename_s
(std::__1::string) $7 = "internal/bootstrap/loaders.js"
There is --link-module configuration option available when building Node which adds a variable (a gyp python build variable that is) named library_files. This is just like adding a file to node.gyp and the library_files section.
$ ./configure --link-module=beve.js --debug
$ NODE_DEBUG=mkcodecache out/Debug/mkcodecache out/Debug/obj/gen/node_code_cache.c
NativeModuleLoader::InitializeModuleCategories we can find how various modules in the lib directory are filtered, or really not included depending on configuration parameters.
module_categories_.cannot_be_required = std::set<std::string> {
...
}
Node snapshot: Much like the run_mkcodecache there is a node_mksnapshot action in node.gpy. This will be run as part of the build.
$ ./out/Debug/node_mksnapshot
Usage: ./out/Debug/node_mksnapshot <path/to/output.cc>
$ lldb -- out/Debug/node --inspect-brk --inspect-brk-node testing.js
(lldb) br s -f node.cc -l 1722
Breakpoint 1: where = node`node::GetInternalBinding(v8::FunctionCallbackInfo<v8::Value> const&) + 448 at node.cc:1722, address = 0x00000001000c1e80
(lldb) r
The --inspect-brk-node
flag will allow us to break in node's bootstrap javascript code.
When we run this we don't need to compile the fs
module as bindingObj will already contain that module.
bindingObj
is created in lib/internal/bootstrap/loaders.js
:
const bindingObj = ObjectCreate(null);
...
const codeCache = getInternalBinding('code_cache');
...
The result of calling getInternalBinding
is what DefineCodeCache
returns in GetInternalBinding
in node.cc:
if (mod != nullptr) {
exports = InitModule(env, mod, module);
} else if (!strcmp(*module_v, "code_cache")) {
// internalBinding('code_cache')
exports = Object::New(env->isolate());
DefineCodeCache(env, exports);
} else {
...
DefineCodeCache
can be found in out/Debug/obj/gen/node_code_cache.cc
and that function will populate the exports object with the compiled
modules code.
An entry from codeCache
is later in loader.js and passed into the ContextifyScript constructor:
const script = new ContextifyScript(
source, this.filename, 0, 0,
codeCache[this.id], false, undefined
);
WASM InstantiateStreaming
So the
isolate->SetWasmCompileStreamingCallback(WasmInstantiateStreamingCallback);
Does this allow us to inspect the first value passed to instantiateStreaming:
WebAssembly.instantiateStreaming(promise, {}).then((results) => {
assert.strictEqual(
results.instance.exports.addTwo(10, 20),
30,
'Exported function should add two numbers.',
);
});
If we pass in a string it will be the value or args
in the callback:
void WasmInstantiateStreamingCallback(const FunctionCallbackInfo<Value>& args) {
std::cout << "WasmInstantiateStreamingCallback..." << '\n';
auto value = args[0];
}
So this would allow us to check the type of the value and make sure that it is a promise which is the only thing that would be supported in node (there is no support for a Response object in node).
We should resolve this promise to get the source and then set that as the result.
deps/v8/src/wasm/wasm-js.cc
:
ASSIGN(Promise::Resolver, resolver, Promise::Resolver::New(context));
#define ASSIGN(type, var, expr) \
Local<type> var; \
do { \
if (!expr.ToLocal(&var)) { \
DCHECK(i_isolate->has_scheduled_exception()); \
return; \
} else { \
DCHECK(!i_isolate->has_scheduled_exception()); \
} \
} while (false)
So that will be expanded by the preprocessor to:
Local<Promise::Resolver> resolver;
if (Promise::Resolver::New(context).ToLocal(&resolver)) {
DCHECK(i_isolate->has_scheduled_exception());
return;
} else {
DCHECK(!i_isolate->has_scheduled_exception());
}
Node/V8 build
I've noticed that even if nothing is changed (really nothing) then V8 will rebuild it's snapshot in some cases which causes the build time to increase. I'm using the following command to build:
$ make -C out BUILDTYPE=Debug -j8
When run like this without a target the all
target will be executed as it is the first target in the makefile:
all: out/Makefile $(NODE_EXE)
NODE_EXE
is a phony target and will always be run.
$ make builddir="$(PWD)/Release" obj="$(PWD)/Release/obj" -f deps/v8/gypfiles/mksnapshot.target.mk
make: Nothing to be done for `/Users/danielbevenius/work/nodejs/node/out/Release/obj.target/mksnapshot/deps/v8/src/snapshot/mksnapshot.o'.
Run single JS test
To run a single JavaScript test through using python:
$ python tools/test.py --mode=release test/pseudo-tty/test-async-wrap-getasyncid-tty.js
This can be useful when you have a test that fails but passes when run with the node executable.
ARM
aarch64 is a 64-bit state
[root@7dd51277f95b node-v10.9.0-rh]# uname -m
aarch64
On a aarch64
system I can run the complete build with tests without specifying the Makefile variable DESTCPU
.
In the configure script there is a matching of 'aarch64' to 'arm64':
[root@7dd51277f95b node-v10.9.0-rh]# python -c 'from configure import host_arch_cc; print host_arch_cc()'
arm64
But when running the tar-headers
target the DESTCPU
is passed to configure:
$(PYTHON) ./configure \
--prefix=/ \
--dest-cpu=$(DESTCPU) \
...
The value of DESTCPU
in this case will be 'aarch64' which will cause the configure to fail:
configure: error: option --dest-cpu: invalid choice: 'aarch64' (choose from 'arm', 'arm64', 'ia32', 'mips', 'mipsel', 'mips64el', 'ppc', 'ppc64', 'x32', 'x64', 'x86', 'x86_64', 's390', 's390x')
I'm not sure about the reason for this. In our case it would be nice to have consitent behaviour between running configure when running on aarch64
.
DESTCPU
is used in $(TARBALL)-headers
and in $(BINARYTAR)
.
I'm not sure how changing DISTCPU to arm64 when the host arch is aarch64 but opening this commit for feedback from others.
WebAssembly Google V8 Liftoff compiler
Before Liftoff TurboFan was used to compile wasm. TurboFan will still be used as hot code will be recompiled by it but LiftOff will generate code as quickly as possible to avoid the time and memory overhead of constructing the intermediate representation. Wasm is supposed to provide predictable performace (once the module is loaded it should not stall. This was done by V8 by compiling ahead of time. For large WASM source files currently these will be read completly and compiled by V8. With the introduction of LiftOff in V8 the startup time will be reduced as LiftOff will take one pass over the bytecode of the WASM function. "the function body decoder does a single pass over the raw WebAssembly bytes and interacts with the subsequent stage via callbacks, so code generation is performed while decoding and validating the function body." Together with WebAssembly’s streaming APIs, this allows V8 to compile WebAssembly code to machine code while downloading over the network.
Web Hypertext Application Technology Working Group (WHATGW) Streams
Streaming data: that is, data that is created, processed, and consumed in an incremental fashion, without ever reading all of it into memory. The Streams Standard provides a common set of APIs for creating and interfacing with such streaming data, embodied in readable streams, writable streams, and transform streams. Instead of reading all data into memory, data can be read piece by piece. There are two types of streams, readable and writable.
ReadableStream
Represents a resource that can be read from. Readable stream Consumer fs/net -> data, data, data -> Reader
WritableStream
Represents a resource that can be written to.
Pipe chains
ReadableStream.pipeThrough() is used for transformting. ReadableStream.pipeTo() pipes to a writable stream which is the endpoint of the chain.
v8_extras
https://v8project.blogspot.com/2016/02/v8-extras.html "Extras are embedder-provided JavaScript files which are compiled directly into the V8 snapshot."
For this to work you have to specify the files to be compiled using v8_extra_library_files
in common.gypi:
'v8_extra_library_files': [
'../scripts/v8_extras.js'
],
These JavaScript files must follow a specific pattern:
(function(global, binding, v8) {
'use strict';
const Object = global.Object;
const name = v8.createPrivateSymbol('name');
const age = v8.createPrivateSymbol('age');
class Something {
constructor(name, age) {
this[name] = name;
this[age] = age;
}
}
Object.defineProperty(global, 'Something', {
value: Something,
enumerable: false,
configurable: true,
writable: true
});
binding.Something = Something;
});
These functions are compiled by deps/v8/src/bootstrapper.cc
:
bool Bootstrapper::CompileExperimentalExtraBuiltin(Isolate* isolate, int index) {
HandleScope scope(isolate);
Vector<const char> name = ExperimentalExtraNatives::GetScriptName(index);
Handle<String> source_code = isolate->bootstrapper()->GetNativeSource(EXPERIMENTAL_EXTRAS, index);
Handle<Object> global = isolate->global_object();
Handle<Object> binding = isolate->extras_binding_object();
Handle<Object> extras_utils = isolate->extras_utils_object();
Handle<Object> args[] = {global, binding, extras_utils};
return Bootstrapper::CompileNative(isolate, name, source_code, arraysize(args), args, EXTENSION_CODE);
}
Now, we can use anything we bind using:
const s = new Something("Fletch", 43);
console.log(s);
$ ./node ../scripts/extra.js
Something { '43': 43, Fletch: 'Fletch' }
Streams
The highWaterMark represents the number of bytes that the internal buffer can hold. If the stream is in object mode this is instead the total number of objects.
Readable
Take the following example:
var Readable = require('stream').Readable;
var rs = new Readable();
var c = 0;
rs._read = function () {
if (c == 0) {
rs.push("bajja");
c++;
} else {
rs.push(null);
}
};
rs.on('readable', function() {
console.log('readable event [There is data in the stream to be read]');
rs.read();
});
The Readable
constructor can be found in lib/_streams_readable.js
.
function Readable(options) {
const isDuplex = (this instanceof Stream.Duplex);
this._readableState = new ReadableState(options, this, isDuplex);
// legacy
this.readable = true;
if (options) {
if (typeof options.read === 'function')
this._read = options.read;
if (typeof options.destroy === 'function')
this._destroy = options.destroy;
}
Stream.call(this);
}
We can see that a new ReadableState is created using the options passed in in addition to the newly created Readable instance. You can pass in overrides for
_readand
_destroyusing the options object. The final call is Stream constructor using
Stream.call(this)`:
const EE = require('events');
const util = require('util');
function Stream() {
EE.call(this);
}
Next we have the call to on
which registers a listener for the readable
event. Note that this is an overloaded function
in Readable that first delegates to Stream to do the registering of the listener and then has some logic. For a readable event
this logic looks like this:
if (state.flowing !== false)
this.resume();
} else if (ev === 'readable') {
if (!state.endEmitted && !state.readableListening) {
state.readableListening = state.needReadable = true;
state.flowing = false;
state.emittedReadable = false;
debug('on readable', state.length, state.reading);
if (state.length) {
emitReadable(this);
} else if (!state.reading) {
process.nextTick(nReadingNextTick, this);
}
}
}
So what is this flowing mode?
When a ReadableStream is created it is in the paused state. It can be moved in to the flowing mode, where data is automatically read
by calling the ReadableStream's _read
function until this function pushes the null value inte to buffer to signal the end of data
which will cause read
to return null.
So, we we take a look at the ReadableStream above and call resume:
function flow(stream) {
const state = stream._readableState;
debug('flow', state.flowing);
while (state.flowing && stream.read() !== null);
}
When pushing null into the stream the readableAddChunck
function will check if it is null and do:
if (chunk === null) {
state.reading = false;
onEofChunk(stream, state);
} else {
...
}
When stepping through the code there are a number of callbacks added to the nexttick queue, multiple for the same event, for example endReadableNT. Where only the first one will actually do anything and the others. But I guess this is because the state of the stream could have been updated by another event.
In our case endEmitted has not happend, nor is readableListening so both of these statement will be true. The current state.length is 0
so there will not be an emittance of the readable event. And state.reading
is false so we will add nReadingNextTick
to as entry
in the TickObject callback queue.
function nReadingNextTick(self) {
debug('readable nexttick read 0');
self.read(0);
}
The other on
functions basically just register the event handlers specified.
Next, we have:
rs.pipe(process.stdout);
Notice that process.stdout
is a WriteStream and we are setting up a pipe from our readable to the writable using the pipe function.
Now, if the state is not flowing the pipe function will call resume on our readable:
if (!state.flowing) {
debug('pipe resume');
src.resume();
}
Readable.prototype.resume = function() {
var state = this._readableState;
if (!state.flowing) {
debug('resume');
// we flow only if there is no one listening
// for readable, but we still have to call
// resume()
state.flowing = !state.readableListening;
resume(this, state);
}
return this;
};
In our case resume
will not be called and instead we will return. Backing up the callstack will return us to runMain
and:
process._tickCallback();
This will call the nReadingNextTick
function which we can see above will call self.read(0)
.
_read()
is what would fetch data from some underlying resource and push it.
rs.push
will emit the readable
event on the next tick. Unless we add an event listener for this event and call read() to consume
the data nothing else will happen. The data will just stay in the buffer.
$ ./node -p 'require("crypto").constants.defaultCoreCipherList' | tr : '\n'
ECDHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES128-GCM-SHA256
ECDHE-RSA-AES256-GCM-SHA384
ECDHE-ECDSA-AES256-GCM-SHA384
DHE-RSA-AES128-GCM-SHA256
ECDHE-RSA-AES128-SHA256
DHE-RSA-AES128-SHA256
ECDHE-RSA-AES256-SHA384
DHE-RSA-AES256-SHA384
ECDHE-RSA-AES256-SHA256
DHE-RSA-AES256-SHA256
HIGH
!aNULL
!eNULL
!EXPORT
!DES
!RC4
!MD5
!PSK
!SRP
!CAMELLIA
The above are declared in src/node_constants.h
.
Just to recap the cipher suite format, lets take ECDHE-RSA-AES128-GCM-SHA256
ECHDE
is the key exchange algorithm, RSA
the authentication algorithm, AES128
is the bulk encryption algorithm, and
$ sslscan --no-failed localhost:1888
Version: 1.11.11-static
OpenSSL 1.0.2f 28 Jan 2016
Connected to ::1
Testing SSL server localhost on port 1888 using SNI name localhost
TLS Fallback SCSV:
Server supports TLS Fallback SCSV
TLS renegotiation:
Secure session renegotiation supported
TLS Compression:
Compression disabled
Heartbleed:
TLS 1.2 not vulnerable to heartbleed
TLS 1.1 not vulnerable to heartbleed
TLS 1.0 not vulnerable to heartbleed
Supported Server Cipher(s):
Preferred TLSv1.2 128 bits ECDHE-RSA-AES128-GCM-SHA256 Curve P-256 DHE 256
Accepted TLSv1.2 256 bits ECDHE-RSA-AES256-GCM-SHA384 Curve P-256 DHE 256
Accepted TLSv1.2 128 bits ECDHE-RSA-AES128-SHA256 Curve P-256 DHE 256
Accepted TLSv1.2 256 bits ECDHE-RSA-AES256-SHA384 Curve P-256 DHE 256
Accepted TLSv1.2 256 bits ECDHE-RSA-AES256-SHA Curve P-256 DHE 256
Accepted TLSv1.2 128 bits ECDHE-RSA-AES128-SHA Curve P-256 DHE 256
Accepted TLSv1.2 256 bits AES256-GCM-SHA384
Accepted TLSv1.2 128 bits AES128-GCM-SHA256
Accepted TLSv1.2 256 bits AES256-SHA256
Accepted TLSv1.2 128 bits AES128-SHA256
Accepted TLSv1.2 256 bits AES256-SHA
Accepted TLSv1.2 128 bits AES128-SHA
Preferred TLSv1.1 256 bits ECDHE-RSA-AES256-SHA Curve P-256 DHE 256
Accepted TLSv1.1 128 bits ECDHE-RSA-AES128-SHA Curve P-256 DHE 256
Accepted TLSv1.1 256 bits AES256-SHA
Accepted TLSv1.1 128 bits AES128-SHA
Preferred TLSv1.0 256 bits ECDHE-RSA-AES256-SHA Curve P-256 DHE 256
Accepted TLSv1.0 128 bits ECDHE-RSA-AES128-SHA Curve P-256 DHE 256
Accepted TLSv1.0 256 bits AES256-SHA
Accepted TLSv1.0 128 bits AES128-SHA
SSL Certificate:
Signature Algorithm: sha1WithRSAEncryption
RSA Key Strength: 2048
Subject: danbev
Issuer: danbev
Not valid before: Dec 7 07:56:37 2017 GMT
Not valid after: Jan 6 07:56:37 2018 GMT
Show node man page
$ man doc/node.1
Slow debug build
The debug build is very slow. This section exists to troubleshoot the debug build.
$ maked
TOUCH 5a237c891f2234af459ae38bfa47f55ce1bae08e.intermediate
TOUCH 000a2d5af5f332d5cc9c0871728bccfb7db9209c.intermediate
ACTION Generating inspector protocol sources from protocol json definition /Users/danielbevenius/work/nodejs/node/out/Debug/obj/gen/src/js_protocol.stamp
ACTION Generating node protocol sources from protocol json 5a237c891f2234af459ae38bfa47f55ce1bae08e.intermediate
ACTION Generating inspector protocol sources from protocol json 000a2d5af5f332d5cc9c0871728bccfb7db9209c.intermediate
TOUCH 922bdaf2d2769f5377a88efe3ece13a08524d234.intermediate
ACTION _Users_danielbevenius_work_nodejs_node_deps_v8_gypfiles_v8_gyp_v8_torque_host_run_torque 922bdaf2d2769f5377a88efe3ece13a08524d234.intermediate
CXX(target) /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/v8_initializers/gen/torque-generated/builtins-array-from-dsl-gen.o
LIBTOOL-STATIC /Users/danielbevenius/work/nodejs/node/out/Debug/libv8_initializers.a
LINK(target) /Users/danielbevenius/work/nodejs/node/out/Debug/mksnapshot
ACTION _Users_danielbevenius_work_nodejs_node_deps_v8_gypfiles_v8_gyp_v8_snapshot_target_run_mksnapshot /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/v8_snapshot/geni/snapshot.cc
CXX(target) /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/v8_snapshot/geni/snapshot.o
LIBTOOL-STATIC /Users/danielbevenius/work/nodejs/node/out/Debug/libv8_snapshot.a
TOUCH /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/deps/v8/gypfiles/v8_maybe_snapshot.stamp
TOUCH /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/deps/v8/gypfiles/v8.stamp
CXX(target) /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/node_lib/src/node.o
LIBTOOL-STATIC /Users/danielbevenius/work/nodejs/node/out/Debug/libnode.a
LINK(target) /Users/danielbevenius/work/nodejs/node/out/Debug/node
TOUCH /Users/danielbevenius/work/nodejs/node/out/Debug/obj.target/rename_node_bin_win.stamp
LINK(target) /Users/danielbevenius/work/nodejs/node/out/Debug/cctest
ICU compler flags
On my machine (macosx) I see the following compler flags:
ccache clang++ -Qunused-arguments '-D_DARWIN_USE_64_BIT_INODE=1' '-DU_COMMON_IMPLEMENTATION=1' '-DU_I18N_IMPLEMENTATION=1' '-DU_IO_IMPLEMENTATION=1' '-DU_TOOLUTIL_IMPLEMENTATION=1'
'-DU_ATTRIBUTE_DEPRECATED=' '-D_CRT_SECURE_NO_DEPRECATE=' '-DU_STATIC_IMPLEMENTATION=1' '-DUCONFIG_NO_SERVICE=1' '-DU_ENABLE_DYLOAD=0' '-DU_HAVE_STD_STRING=1'
'-DUCONFIG_NO_BREAK_ITERATION=0' -I../deps/icu-small/source/common -I../deps/icu-small/source/i18n -I../deps/icu-small/source/tools/toolutil
-Os -gdwarf-2 -mmacosx-version-min=10.7 -arch x86_64 -Wall -Wendif-labels -W -Wno-unused-parameter -std=gnu++1y -stdlib=libc++ -fno-exceptions
-fno-strict-aliasing
-MMD -MF /Users/danielbevenius/work/nodejs/node-poc/out/Release/.deps//Users/danielbevenius/work/nodejs/node-poc/out/Release/obj.host/icutools/deps/icu-small/source/i18n/regexcmp.o.d.raw
-c -o /Users/danielbevenius/work/nodejs/node-poc/out/Release/obj.host/icutools/deps/icu-small/source/i18n/regexcmp.o ../deps/icu-small/source/i18n/regexcmp.cpp
But on linux for example I don't see -fno-strict-aliasing
which is causing a number of warnings to be generated. This affects us at work where we have a program named RPMDIFF that
will generate an error for these warnings.
Troubleshooting test/addons-napi/test_threadsafe_function/test.js
This test fails when built with --debug
. The error is the following:
$ ./configure --debug && make -j8
$ make build-addons-napi
FATAL ERROR: v8::HandleScope::CreateHandle() Cannot create a handle without a HandleScope
1: 0x10004e287 node::DumpBacktrace(__sFILE*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
2: 0x1000cd37b node::Abort() [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
3: 0x1000cd69f node::OnFatalError(char const*, char const*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
4: 0x10062f171 v8::Utils::ReportApiFailure(char const*, char const*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
5: 0x10063448f v8::Utils::ApiCheck(bool, char const*, char const*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
6: 0x1011aa36f v8::internal::HandleScope::Extend(v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
7: 0x100616118 v8::internal::HandleScope::CreateHandle(v8::internal::Isolate*, v8::internal::Object*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
8: 0x10061606c v8::internal::HandleScope::GetHandle(v8::internal::Isolate*, v8::internal::Object*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
9: 0x100615fc9 v8::internal::HandleBase::HandleBase(v8::internal::Object*, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
10: 0x100629b80 v8::internal::Handle<v8::internal::FixedArray>::Handle(v8::internal::FixedArray*, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
11: 0x100629a75 v8::internal::Handle<v8::internal::FixedArray>::Handle(v8::internal::FixedArray*, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
12: 0x100635960 v8::EmbedderDataFor(v8::Context*, int, bool, char const*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
13: 0x100635d7c v8::Context::SlowGetAlignedPointerFromEmbedderData(int) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
14: 0x10001c26b v8::Context::GetAlignedPointerFromEmbedderData(int) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
15: 0x1000144ea node::Environment::GetCurrent(v8::Local<v8::Context>) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
16: 0x1000f49e2 napi_env__::node_env() const [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
17: 0x1000f9c54 (anonymous namespace)::v8impl::ThreadSafeFunction::CloseHandlesAndMaybeDelete(bool)::'lambda'(uv_handle_s*)::operator()(uv_handle_s*) const [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
18: 0x1000f9b2c void node::Environment::CloseHandle<uv_handle_s, (anonymous namespace)::v8impl::ThreadSafeFunction::CloseHandlesAndMaybeDelete(bool)::'lambda'(uv_handle_s*)>(uv_handle_s*, (anonymous namespace)::v8impl::ThreadSafeFunction::CloseHandlesAndMaybeDelete(bool)::'lambda'(uv_handle_s*))::'lambda'(uv_handle_s*)::operator()(uv_handle_s*) const [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
19: 0x1000f99c8 void node::Environment::CloseHandle<uv_handle_s, (anonymous namespace)::v8impl::ThreadSafeFunction::CloseHandlesAndMaybeDelete(bool)::'lambda'(uv_handle_s*)>(uv_handle_s*, (anonymous namespace)::v8impl::ThreadSafeFunction::CloseHandlesAndMaybeDelete(bool)::'lambda'(uv_handle_s*))::'lambda'(uv_handle_s*)::__invoke(uv_handle_s*) [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
20: 0x101da3906 uv__finish_close [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
21: 0x101da14aa uv__run_closing_handles [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
22: 0x101da1211 uv_run [/Users/danielbevenius/work/nodejs/node-poc/out/Debug/node]
The path taken for a normal/Release build would take would be the following from deps/v8/include/v8.h
:
void* Context::GetAlignedPointerFromEmbedderData(int index) {
#ifndef V8_ENABLE_CHECKS
typedef internal::Internals I;
return I::ReadEmbedderData<void*>(this, index);
#else
return SlowGetAlignedPointerFromEmbedderData(index);
#endif
}
For a debug build the SlowGetAlignedPointerFromEmbedderData
would be called leading to the error above.
The issue here seems to be with this lambda:
env->node_env()->CloseHandle(
reinterpret_cast<uv_handle_t*>(&async),
[](uv_handle_t* handle) -> void {
ThreadSafeFunction* ts_fn =
node::ContainerOf(&ThreadSafeFunction::async,
reinterpret_cast<uv_async_t*>(handle));
ts_fn->env->node_env()->CloseHandle(
reinterpret_cast<uv_handle_t*>(&ts_fn->idle),
[](uv_handle_t* handle) -> void {
ThreadSafeFunction* ts_fn =
node::ContainerOf(&ThreadSafeFunction::idle,
reinterpret_cast<uv_idle_t*>(handle));
ts_fn->Finalize();
});
});
The call to ts_fn->env->node_env()->CloseHandle
will cause this error.
The issue was fixed by adding two HandleScopes, on in the function and then one in the lambda:
void CloseHandlesAndMaybeDelete(bool set_closing = false) {
v8::HandleScope scope(env->isolate);
...
v8::HandleScope scope(ts_fn->env->isolate);
GetHashes
I want to understand what the following function actually does:
void GetHashes(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CipherPushContext ctx(env);
EVP_MD_do_all_sorted(array_push_back<EVP_MD>, &ctx);
args.GetReturnValue().Set(ctx.arr);
}
EVP_MD_do_all_sorted
is a function defined in deps/openssl/openssl/crypto/evp/names.c
:
void EVP_CIPHER_do_all_sorted(void (*fn) (const EVP_CIPHER *ciph,
const char *from, const char *to,
void *x), void *arg) {
...
}
Notice that the first argument is a pointer to a function that returns void, and takes four arguments, and the second
argument is a void pointer (void* arg
).
In this case the function passed as the first argument is array_push_back<EVP_MD>
:
(just showing the CipherPushContext
for completeness)
class CipherPushContext {
public:
explicit CipherPushContext(Environment* env)
: arr(Array::New(env->isolate())),
env_(env) {
}
inline Environment* env() const { return env_; }
Local<Array> arr;
private:
Environment* env_;
};
static void array_push_back(const TypeName* md,
const char* from,
const char* to,
void* arg) {
CipherPushContext* ctx = static_cast<CipherPushContext*>(arg);
ctx->arr->Set(ctx->arr->Length(), OneByteString(ctx->env()->isolate(), from));
}
Notice that this matches the signature of the function parameter definition. So this function will be called by OpenSSL for each hash and then it is added to a V8 Array.
Objects in OpenSSL can have a short name, a long name and a numerical identifier ( NID ) associated with them.
const char* OBJ_nid2sn(int n)
Converts the passed in numeric identifier (NID) to itsshort name. Returns Null if an error is reported.
Crypto::Convert
Converts the EC Diffie-Hellman public key specified by key and curve to the "compressed", "uncompressed", or "hybrid" format.
size_t EC_POINT_point2oct(const EC_GROUP *group, const EC_POINT *point,
point_conversion_form_t form, unsigned char *buf,
size_t len, BN_CTX *ctx)
Password hashing
When storing a users password we first hash it before storing it (remember that hashing is/should be a oneway thing and we cannot unhash it later). When the user logs in we pass the password to the same hash function and compare that with the entry we have stored. So if the database is compromised the only thing in it will be hashes. If two users have the same password the same hash would be stored in the database which can give a hacker some clues. For this we add what is called a salt value which is generated randomly to the users entered password before it is passed to the hash function. We need to store the salt as well as it will be required when the user enters his/her password again and we have to add this to that password before calling the hash function and then comparing the hashes. When the salt is unique for each hash, we inconvenience the attacker by now having to compute a rainbow table for each user hash. The algorithms for hashing are made slow intentionally to withstand brute force attacks. MD5 was considered unsuitable for hashing in 2005. SHA-0 and SHA-1 are no longer considered as secure. When Linkedin got compromised in 2012 most of their passwords were stolen because, at that time they were still using SHA-1 to hash their passwords. Even though SHA-0 and SHA-1 were unsuccessful, SHA-2 still has some strength left with it’s different variations of hash lengths. SHA family has also released SHA-3 improving its predecessors.
PBKDF2 (Password-Based Key Derivation Function2)
In PBKDF2 we can force the algorithm to behave slowly by increasing its iteration count.
const crypto = require('crypto');
const hash = crypto.pbkdf2Sync('testing', 'salt', 2, 20, 'sha256');
console.log(hash.toString('hex'));
bcrypt
Is used for password hashing and is derived from the Blowfish block cipher which uses tables (when hashing) which are reside in memory which means that a certain amount of memory is required.
scrypt
Scrypt is another hashing algorithm which has the same properties as bcrypt, except that when you increase rounds, it exponentially increases calculation time and memory space required to generate the hash. Scrypt was created as response to evolving attacks on bcrypt and is completely unfeasable when using FPGAs or GPUs due to memory constraints. Scrypt requires the storage of a series of intermediate state data snapshots, which are used in further derivation operations. These snapshots, stored in memory, grow exponentially compared when rounds increase. So adding a round, will make it exponentially harder to brute force the password.
FPGA
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".
timingSafeEqual
When comparing MAC hashes, AEAD authentication tags, or other hash values in the context of authentication or integrity checking, it is important not to leak timing information to a potential attacker.
Bytewise memory comparisons (such as memcmp) are usually optimized so that they return a nonzero value as soon as a mismatch is found. This early-return behavior can leak timing information, allowing an attacker to iteratively guess the correct result.
This function uses OpenSSL's CRYPTO_memcmp
function.
class PublicKeyCipher {
public:
typedef int (*EVP_PKEY_cipher_init_t)(EVP_PKEY_CTX* ctx);
typedef int (*EVP_PKEY_cipher_t)(EVP_PKEY_CTX* ctx,
unsigned char* out, size_t* outlen,
const unsigned char* in, size_t inlen);
template <Operation operation,
EVP_PKEY_cipher_init_t EVP_PKEY_cipher_init,
EVP_PKEY_cipher_t EVP_PKEY_cipher>
static bool Cipher(const char* key_pem,
...
env->SetMethod(target, "publicEncrypt",
PublicKeyCipher::Cipher<PublicKeyCipher::kPublic,
EVP_PKEY_encrypt_init,
EVP_PKEY_encrypt>);
So EVP_PKEY_cipher_init_t
is a function pointer to a function that takes
an EVP_PKEY_CTX* and returns an int. And this is being used in a templated
function which is shown above. Notice that this function is getting specialized
with EVP_PKEY_encrypt_init
and EVP-PKEY_encrypt
functions from OpenSSL.
Is it laste used like this:
if (EVP_PKEY_cipher_init(ctx.get()) <= 0)
return false;
if (EVP_PKEY_cipher(ctx.get(), nullptr, out_len, data, len) <= 0)
return false;
}
SecureContext
PKCS12 (Public Key Cryptography Standards)
In cryptography, PKCS #12 defines an archive file format for storing many cryptography objects as a single file. It is commonly used to bundle a private key with its X.509 certificate or to bundle all the members of a chain of trust.
A PKCS #12 file may be encrypted and signed. The internal storage containers, called "SafeBags", may also be encrypted and signed. A few SafeBags are predefined to store certificates, private keys and CRLs. Another SafeBag is provided to store any other data at individual implementer's choice.
PKCS #12 is one of the family of standards called Public-Key Cryptography Standards (PKCS) published by RSA Laboratories. The filename extension for PKCS #12 files is ".p12" or ".pfx".[4] These files can be created, parsed and read out with the OpenSSL pkcs12 command
TLS Session Tickets
The ticket is created by a TLS server and sent to a TLS client. The TLS client presents the ticket to the TLS server to resume a session.
The generation of these ticket values are done on the server side when intializing the SecureContext::Init:
// OpenSSL 1.1.0 changed the ticket key size, but the OpenSSL 1.0.x size was
// exposed in the public API. To retain compatibility, install a callback
// which restores the old algorithm.
if (RAND_bytes(sc->ticket_key_name_, sizeof(sc->ticket_key_name_)) <= 0 ||
RAND_bytes(sc->ticket_key_hmac_, sizeof(sc->ticket_key_hmac_)) <= 0 ||
RAND_bytes(sc->ticket_key_aes_, sizeof(sc->ticket_key_aes_)) <= 0) {
return env->ThrowError("Error generating ticket keys");
}
SSL_CTX_set_tlsext_ticket_key_cb(sc->ctx_.get(), TicketCompatibilityCallback);
The client indicates that it supports this mechanism by including a SessionTicket TLS extension in the ClientHello message. The extension will be empty if the client does not already possess a ticket for the server. The server sends an empty SessionTicket extension to indicate that it will send a new session ticket using the NewSessionTicket handshake message.
The server uses two different keys: one 128-bit key for Advanced Encryption Standard (AES) in Cipher Block Chaining (CBC) mode encryption and one 256-bit key for HMAC-SHA-256
For new sessions tickets, when the client doesn't present a session ticket, or an attempted retreival of the ticket failed, or a renew option was indicated, the callback function will be called with enc equal to 1. The OpenSSL library expects that the function will set an arbitary name, initialize iv, and set the cipher context ctx and the hash context hctx.
The name is 16 characters long and is used as a key identifier
#error directive
The #error directive is used in the code base, for example in node_crypto.cc
#if OPENSSL_VERSION_NUMBER >= 0x10100000L && \
OPENSSL_VERSION_NUMBER < 0x10100070L
#error "OpenSSL 1.1.0 revisions before 1.1.0g are not supported"
#endif
...
}
The #error directive causes the compiler (or preprocessor) to output the error message.
secure-pair.js
This test fails when using OpenSSL 1.?.?
=== release test-benchmark-tls ===
Path: sequential/test-benchmark-tls
(node:3355) [DEP0107] DeprecationWarning: tls.convertNPNProtocols() is deprecated.
_tls_common.js:113
c.context.setCert(cert);
^
Error: error:140AB18F:SSL routines:SSL_CTX_use_certificate:ee key too small
at Object.createSecureContext (_tls_common.js:113:17)
at Object.connect (_tls_wrap.js:1121:48)
at Server.proxy.listen (/builddir/build/BUILD/node-v10.14.0/benchmark/tls/secure-pair.js:42:24)
at Object.onceWrapper (events.js:273:13)
at Server.emit (events.js:182:13)
at emitListeningNT (net.js:1320:10)
at process._tickCallback (internal/process/next_tick.js:63:19)
at Function.Module.runMain (internal/modules/cjs/loader.js:744:11)
at startup (internal/bootstrap/node.js:285:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:739:3)
If we examine
const options = {
key: fs.readFileSync(`${cert_dir}/test_key.pem`),
cert: fs.readFileSync(`${cert_dir}/test_cert.pem`),
ca: [ fs.readFileSync(`${cert_dir}/test_ca.pem`) ],
ciphers: 'AES256-GCM-SHA384',
isServer: true,
requestCert: true,
rejectUnauthorized: true,
};
Lets check the size of test_key.pem
:
$ openssl rsa -in test_key.pem -text -noout
Private-Key: (1024 bit)
modulus:
...
This was due to the crypto-policy on RHEL8 which is DEFAULT (by default) and it will not allow key sizes smaller than 2048. This can be worked around by setting the policy to legacy:
$ update-crypto-policies --set LEGACY
or optionally setting the following environment variable:
$ export OPENSSL_TLS_SECURITY_LEVEL=1
Another way could possibly be to set the security level for OpenSSL:
SSL_CTX_set_security_level(sc->ctx_.get(), 1);
Not sure if this is something we would want to do but I'm sticking this here just in case it migth be needed later.
Inspect
util.inspect.defaultOptions.showHidden = true
ssl_wrap
When is the constructor of SSLWrap called?
TLSSocket.prototype._wrapHandle = function(wrap) {
var handle;
...
// Wrap socket's handle
const context = options.secureContext ||
options.credentials ||
tls.createSecureContext(options);
const externalStream = handle._externalStream;
const res = tls_wrap.wrap(externalStream,
context.context,
!!options.isServer);
The wrap function is defined in src/tls_wrap.cc
:
void TLSWrap::Initialize(Local<Object> target,
Local<Value> unused,
Local<Context> context,
void* priv) {
Environment* env = Environment::GetCurrent(context);
env->SetMethod(target, "wrap", TLSWrap::Wrap);
Local<External> stream_obj = args[0].As<External>();
Local<Object> sc = args[1].As<Object>();
Kind kind = args[2]->IsTrue() ? SSLWrap<TLSWrap>::kServer :
SSLWrap<TLSWrap>::kClient;
StreamBase* stream = static_cast<StreamBase*>(stream_obj->Value());
CHECK_NOT_NULL(stream);
TLSWrap* res = new TLSWrap(env, kind, stream, Unwrap<SecureContext>(sc));
args.GetReturnValue().Set(res->object());
RHEL8 issue
I'm seeing a failure when running test/parallel/test-tls-handshake-error.js
on
RHEL8 (dynamically linking to the FIPS compatible OpenSSL library). The error is
:
The test in question looks like it is intended to test a handshake error where the cipher passed in is not available.
These are the ciphers available on the current master:
$ ./out/Release/openssl-cli ciphers -s | tr ':' '\n'
ECDHE-ECDSA-AES256-GCM-SHA384
ECDHE-RSA-AES256-GCM-SHA384
DHE-RSA-AES256-GCM-SHA384
ECDHE-ECDSA-CHACHA20-POLY1305
ECDHE-RSA-CHACHA20-POLY1305
DHE-RSA-CHACHA20-POLY1305
ECDHE-ECDSA-AES128-GCM-SHA256
ECDHE-RSA-AES128-GCM-SHA256
DHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES256-SHA384
ECDHE-RSA-AES256-SHA384
DHE-RSA-AES256-SHA256
ECDHE-ECDSA-AES128-SHA256
ECDHE-RSA-AES128-SHA256
DHE-RSA-AES128-SHA256
ECDHE-ECDSA-AES256-SHA
ECDHE-RSA-AES256-SHA
DHE-RSA-AES256-SHA
ECDHE-ECDSA-AES128-SHA
ECDHE-RSA-AES128-SHA
DHE-RSA-AES128-SHA
AES256-GCM-SHA384
AES128-GCM-SHA256
AES256-SHA256
AES128-SHA256
AES256-SHA
AES128-SHA
And these are the ciphers avilable on RHEL8:
bash-4.4# openssl ciphers -s | tr ':' '\n'
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256
TLS_AES_128_GCM_SHA256
TLS_AES_128_CCM_SHA256
ECDHE-ECDSA-AES256-GCM-SHA384
ECDHE-RSA-AES256-GCM-SHA384
ECDHE-ECDSA-CHACHA20-POLY1305
ECDHE-RSA-CHACHA20-POLY1305
ECDHE-ECDSA-AES256-CCM
ECDHE-ECDSA-AES128-GCM-SHA256
ECDHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES128-CCM
ECDHE-ECDSA-AES128-SHA256
ECDHE-RSA-AES128-SHA256
ECDHE-ECDSA-AES256-SHA
ECDHE-RSA-AES256-SHA
ECDHE-ECDSA-AES128-SHA
ECDHE-RSA-AES128-SHA
ECDHE-ECDSA-RC4-SHA
ECDHE-RSA-RC4-SHA
ECDHE-ECDSA-DES-CBC3-SHA
ECDHE-RSA-DES-CBC3-SHA
AES256-GCM-SHA384
AES256-CCM
AES128-GCM-SHA256
AES128-CCM
AES256-SHA256
AES128-SHA256
AES256-SHA
AES128-SHA
RC4-SHA
DES-CBC3-SHA
DHE-DSS-AES256-GCM-SHA384
DHE-RSA-AES256-GCM-SHA384
DHE-RSA-CHACHA20-POLY1305
DHE-RSA-AES256-CCM
DHE-DSS-AES128-GCM-SHA256
DHE-RSA-AES128-GCM-SHA256
DHE-RSA-AES128-CCM
DHE-RSA-AES256-SHA256
DHE-DSS-AES256-SHA256
DHE-RSA-AES128-SHA256
DHE-DSS-AES128-SHA256
DHE-RSA-AES256-SHA
DHE-DSS-AES256-SHA
DHE-RSA-AES128-SHA
DHE-DSS-AES128-SHA
DHE-RSA-DES-CBC3-SHA
DHE-DSS-DES-CBC3-SHA
Notice that RC4
is not available on the current master but is on RHEL.
So instead of getting a no cipher match
match error the error will instead be
:
{ Error: Client network socket disconnected before secure TLS connection was established
at TLSSocket.onConnectEnd (_tls_wrap.js:1184:19)
at Object.onceWrapper (events.js:276:13)
at TLSSocket.emit (events.js:193:15)
at endReadableNT (_stream_readable.js:1130:12)
at processTicksAndRejections (internal/process/next_tick.js:76:17)
code: 'ECONNRESET',
path: undefined,
host: undefined,
port: 39617,
localAddress: undefined }
Welcome to the Node.js Technical Steering Commitee Meeting January 16 2019 Lets go the first topic of the agenda
- Start with public announcements?
The NODE_MODULE_VERSION issue
So the issue was that Electron 3 and 4 were using the same ABI number (64). Electron embeds node and has prebuilt packages. These packages work with Electron 3 but cause the dynamic linker to fail to bind symbols when loaded in Electron 4. The changes between Electron 3 and 4 are switching to GN and from OpenSSL to BoringSSL.
The request is to be able to have a specific reserved NODE_MODULE_VERSION that an embedder/distro can use to identify their specific version. This would allow packages to specify this module version for packages/addons and they would be rejected by an incorrect version.
I'm guessing here as I've not used node-pre-gyp
that packages could then
be built specifying a unique NODE_MODULE_VERSION for Electron 4 and have them
built against that version.
So say electron embeds (compiles into its application) and it will have the NODE_MODULE_VERSION specified. A package/module compiled against that version would have the same version compiled into it.
Next a new electron version is released with a later version of node, but there have not been any breaking APIs on the C++ side in node so the NODE_MODULE_VERSION has not changed. But there are have been ABI breaking changes to Electrons dependencies (OpenSSL/BoringSSL, V8) which cause the application to report dynamic linking errors when trying to bind symbols.
With a electron specific NODE_MODULE_VERSION they could update this to indicate when such breaking ABI changes have been made and have node reject them before trying to dynamically link them. Is this correct?
There are currently two suggestions about how to manage these version, one being in src/node_versions.h and the other in a document which gets updated.
Lets take RHEL8 as an example. We are going to dynamically link to the system provided FIPS compatible OpenSSL library. And lets say the version is v12.0.0-pre. We compile our addon/native module against this version. Our addon also uses OpenSSL for something so it includes some OpenSSL headers. This addons is then used against v12.0.0-pre but this version was statically linked against the OpenSSL version that was shipped with Node (at this time 1.1.0j). Now, node will not object to loading this addon as the NODE_MODULE_VERSION will match but the dynamic linker migth not not be able to load symbols as there would be changes between. For RHEL8 OpenSSL should be ABI complient with 1.1.0 so I don't think this would be easy to detect, but
In file included from ../src/node.h:63, from ../test/cctest/test_node_postmortem_metadata.cc:2: ../deps/v8/include/v8.h: In instantiation of 'void v8::PersistentBase<T>::SetWeak(P*, typename v8::WeakCallbackInfo<P>::Callback, v8::WeakCallbackType) [with P = node::BaseObject; T = v8::Object; typename v8::WeakCallbackInfo<P>::Callback = void (*)(const v8::WeakCallbackInfonode::BaseObject&)]':
../src/base_object-inl.h:104:42: required from here ../deps/v8/include/v8.h:9453:16: warning: cast between incompatible function types from 'v8::WeakCallbackInfonode::BaseObject::Callback' {aka 'void ()(const v8::WeakCallbackInfonode::BaseObject&)'} to 'Callback' {aka 'void ()(const v8::WeakCallbackInfo<void>&)'} [-Wcast-function-type] reinterpret_cast<Callback>(callback), type); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I'm looking at some warnings that are generated on Linux but not on mac from
test/addon/uv-handle-lead/binding.cc
:
NODE_MODULE_INIT(/*exports, module, context*/) {
NODE_SET_METHOD(exports, "leakHandle", LeakHandle);
}
$ gcc -E test/addons/uv-handle-leak/binding.cc -Isrc -Ideps/v8/include -Ideps/uv/include
Below is the output of the pre-processor:
extern "C" __attribute__((visibility("default"))) void node_register_module_v68(
v8::Local<v8::Object> exports,
v8::Local<v8::Value> module,
v8::Local<v8::Context> context); extern "C" {
static node::node_module _module = {
68,
0,
__null
, "test/addons/uv-handle-leak/binding.cc",
__null
, (node::addon_context_register_func) (node_register_module_v68), "NODE_GYP_MODULE_NAME",
__null
,
__null
};
static void _register_NODE_GYP_MODULE_NAME(void) __attribute__((constructor));
static void _register_NODE_GYP_MODULE_NAME(void) {
node_module_register(&_module);
}
}
void node_register_module_v68(v8::Local<v8::Object> exports,
v8::Local<v8::Value> module,
v8::Local<v8::Context> context) {
node::NODE_SET_METHOD(exports, "leakHandle", LeakHandle);
}
The following warning is displayed on linux systems (not on mac though);
make[1]: Entering directory '/node/test/addons/uv-handle-leak/build'
CXX(target) Release/obj.target/binding/binding.o
SOLINK_MODULE(target) Release/obj.target/binding.node
COPY Release/binding.node
make[1]: Leaving directory '/node/test/addons/uv-handle-leak/build'
In file included from ../binding.cc:1:
/node/src/node.h:515:51:
warning: cast between incompatible function types from
'void (*)(v8::Local<v8::Object>, v8::Local<v8::Value>, v8::Local<v8::Context>)'
to
'node::addon_context_register_func' {aka 'void (*)(v8::Local<v8::Object>, v8::Local<v8::Value>, v8::Local<v8::Context>, void*)'} [-Wcast-function-type]
(node::addon_context_register_func) (regfunc), \
^
/node/src/node.h:533:3: note: in expansion of macro 'NODE_MODULE_CONTEXT_AWARE_X'
NODE_MODULE_CONTEXT_AWARE_X(modname, regfunc, NULL, 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~~
/node/src/node.h:556:3: note: in expansion of macro 'NODE_MODULE_CONTEXT_AWARE'
NODE_MODULE_CONTEXT_AWARE(NODE_GYP_MODULE_NAME, \
^~~~~~~~~~~~~~~~~~~~~~~~~
../binding.cc:45:1: note: in expansion of macro 'NODE_MODULE_INIT'
NODE_MODULE_INIT(/*exports, module, context*/) {
^~~~~~~~~~~~~~~~
node::addon_context_register_func
is defined as:
typedef void (*addon_context_register_func)(
v8::Local<v8::Object> exports,
v8::Local<v8::Value> module,
v8::Local<v8::Context> context,
void* priv);
Notice that the last parameter for this function is not specified in the macro
NODE_MODULE_INITIALIZER
:
#define NODE_MODULE_INIT() \
extern "C" NODE_MODULE_EXPORT void \
NODE_MODULE_INITIALIZER(v8::Local<v8::Object> exports, \
v8::Local<v8::Value> module, \
v8::Local<v8::Context> context); \
NODE_MODULE_CONTEXT_AWARE(NODE_GYP_MODULE_NAME, \
NODE_MODULE_INITIALIZER) \
NODE_MODULE_INITIALIZER(v8::Local<v8::Object> exports, \
v8::Local<v8::Value> module, \
v8::Local<v8::Context> context)
Adding the last parameter void* priv
to the last NODE_MODULE_INITIALIZER
allows the function signature to match node::addon_context_register_func
:
NODE_MODULE_INITIALIZER(v8::Local<v8::Object> exports, \
v8::Local<v8::Value> module, \
v8::Local<v8::Context> context) \
void* priv)
Building addon in /node/test/addons/make-callback
/node/tools/build-addons.js:58
main(process.argv[3]).catch((err) => setImmediate(() => { throw err; }));
^
Error: spawn /node/out/Release/node EACCES
at Process.ChildProcess._handle.onexit (internal/child_process.js:246:19)
at onErrorNT (internal/child_process.js:422:16)
at processTicksAndRejections (internal/process/next_tick.js:76:17)
make[1]: *** [Makefile:380: test/addons/.buildstamp] Error 1
make[1]: *** Waiting for unfinished jobs....
touch /node/out/Release/obj.target/rename_node_bin_win.stamp
Error handling
This section takes a closer look at node's internal error handling.
Lets take a look at a spcecific error ERR_CRYPTO_FIPS_FORCED
. This is defined in
lib/internal/errors.js:
E('ERR_CRYPTO_FIPS_FORCED',
'Cannot set FIPS mode, it was forced with --force-fips at startup.', Error);
And the function E
looks like this:
function E(sym, val, def, ...otherClasses) {
// Special case for SystemError that formats the error message differently
// The SystemErrors only have SystemError as their base classes.
messages.set(sym, val);
if (def === SystemError) {
def = makeSystemErrorWithCode(sym);
} else {
def = makeNodeErrorWithCode(def, sym);
}
if (otherClasses.length !== 0) {
otherClasses.forEach((clazz) => {
def[clazz.name] = makeNodeErrorWithCode(clazz, sym);
});
}
codes[sym] = def;
}
The first thing that happens is that an entry is added to the messages
map, in
our case:
messages.set('ERR_CRYPTO_FIPS_FORCED', 'Cannot set FIPS mode, it was forced with --force-fips at startup.');
All codes are function (constructors)
../deps/v8/src/inspector/v8-runtime-agent-impl.cc: In member function 'virtual v8_inspector::protocol::Response v8_inspector::V8RuntimeAgentImpl::getIsolateId(v8_inspector::String16*)':
../deps/v8/src/inspector/v8-runtime-agent-impl.cc:626:38: error: expected ')' before 'PRIx64'
std::snprintf(buf, sizeof(buf), "%" PRIx64, m_inspector->isolateId());
~ ^~~~~~~
)
deps/v8/src/inspector/v8-runtime-agent-impl.cc:
#include <inttypes.h>
These are the include paths used by gcc
bash-4.2# echo | g++ -E -Wp,-v -
ignoring nonexistent directory "/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/include-fixed"
ignoring nonexistent directory "/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/include
/usr/local/include
/opt/rh/devtoolset-8/root/usr/include
/usr/include
End of search list.
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "<stdin>"
bash-4.2# find / -name 'inttypes.h'
/opt/rh/devtoolset-8/root/usr/include/c++/8/tr1/inttypes.h
/usr/include/c++/4.8.2/tr1/inttypes.h
/usr/include/inttypes.h
Now, /usr/include/inttypes.h
contains the PRIx64
macro.
This issue can be simulated using:
#include <inttypes.h>
//#include <cinttypes>
#include <stdio.h>
int main(int argc, char** argv) {
uint64_t val = 0x123;
printf("val = 0x%" PRIx64 "\n", val);
}
bash-4.2# g++ -std=c++11 pri.cc -o pri
pri.cc: In function 'int main(int, char**)':
pri.cc:7:21: error: expected ')' before 'PRIx64'
printf("val = 0x%" PRIx64 "\n", val);
~ ^~~~~~~
)
Run preprocessor:
gcc -E pri.cc > output
It looks like /usr/include/inttypes.h is being used:
# 1 "/usr/include/inttypes.h" 1 3 4
And it has PRIx64
:
Adding -D__STDC_FORMAT_MACROS
seems to to the trick:
bash-4.2# g++ -D__STDC_FORMAT_MACROS pri.cc -o pri
But why? Define __STDC_FORMAT_MACROS in order to enable all macros defined by inttypes.h in C++ mode, as was required by the C99 standard and was enforced by multiple old glibc versions.
If we look in /usr/include/inttypes.h:
#if !defined __cplusplus || defined __STDC_FORMAT_MACROS
So if this is not the case PRIx64 and others macros will not be defined.
Inspector
If the build is configured to use V8's inspector protocol the following gyp file will be include:
[ 'v8_enable_inspector==1', {
'includes' : [ 'src/inspector/node_inspector.gypi' ],
The following warning is generated when building node:
/home/danielbevenius/work/nodejs/node/out/Release/obj/gen/src/node/inspector/protocol/Protocol.cpp: In member function ‘virtual std::unique_ptr<node::inspector::protocol::Value> node::inspector::protocol::DictionaryValue::clone() const’:
/home/danielbevenius/work/nodejs/node/out/Release/obj/gen/src/node/inspector/protocol/Protocol.cpp:698:21: error: redundant move in return statement [-Werror=redundant-move]
698 | return std::move(result);
| ~~~~~~~~~^~~~~~~~
/home/danielbevenius/work/nodejs/node/out/Release/obj/gen/src/node/inspector/protocol/Protocol.cpp:698:21: note: remove ‘std::move’ call
/home/danielbevenius/work/nodejs/node/out/Release/obj/gen/src/node/inspector/protocol/Protocol.cpp: In member function ‘virtual std::unique_ptr<node::inspector::protocol::Value> node::inspector::protocol::ListValue::clone() const’:
/home/danielbevenius/work/nodejs/node/out/Release/obj/gen/src/node/inspector/protocol/Protocol.cpp:739:21: error: redundant move in return statement [-Werror=redundant-move]
739 | return std::move(result);
| ~~~~~~~~~^~~~~~~~
Notice that this warning is coming from generated code. These is generated by
files in tools/inspector_protocol/
.
std::unique_ptr<Value> DictionaryValue::clone() const
{
std::unique_ptr<DictionaryValue> result = DictionaryValue::create();
for (size_t i = 0; i < m_order.size(); ++i) {
String key = m_order[i];
Dictionary::const_iterator value = m_data.find(key);
DCHECK(value != m_data.cend() && value->second);
result->setValue(key, value->second->clone());
}
return std::move(result);
}
This is generated from lib/Values_cpp.template
Applying a patch from V8 to node
$ git am -3 --directory=deps/v8
Applying a patch from Node V8 deps to V8
Sometime we need to backport patches and normally this would mean working on the version of V8 that is shipped with Node. To verify that the test compile for this version of V8 it can be useful to apply the patch to a branch of V8 and verify that it compiles and perhaps manually run tests.
$ git apply -p3 ~/work/nodejs/node/aarch64-yarn-master.patch
V8-CI
I keep forgetting how to run this build so documenting it now. Take the following PR: https://github.com/nodejs/node/pull/37225
The build configuration for this task in ci are:
GITHUB_ORG: nodejs
REPO_NAME: node
GIT_REMOTE_REF: refs/pull/37225/head
REBASE_ONTO: origin/v14.x-staging