udp downstream api - work in progress #28

Closed

Conversation

@alonbg commented Jul 27, 2016

nginx.conf with both tcp and udp test servers

testing udp: not working (just yet ...)
client: socat -T10 -t10 -d -d -d - udp:127.0.0.1:8070
echo server: socat -d -d -d udp-l:8071,reuseaddr,fork,crlf system:"cat"

testing tcp: working just fine
client: socat -d -d -d - tcp:127.0.0.1:5354
echo server: socat -d -d -d tcp-l:5355,reuseaddr,fork,crlf system:"cat"

ngx_peer_connection_t *pc;
int n, raw;
ngx_stream_session_t *s;
ngx_peer_connection_t *pc;
Member

Style: indentation problems here?

Member

Please fix the indentation issues in the rest of your changes. We use 4-space indentation consistently.

@agentzh (Member) commented Jul 27, 2016

Will you add some test cases to the existing test suite? Also, please rebase to the current git master branch in your own branch. Many thanks!

@@ -83,7 +90,7 @@ ngx_stream_lua_inject_socket_udp_api(ngx_log_t *log, lua_State *L)
     lua_setfield(L, -2, "udp"); /* ngx socket */
 
     /* udp socket object metatable */
-    lua_pushlightuserdata(L, &ngx_stream_lua_socket_udp_metatable_key);
+    lua_pushlightuserdata(L, &ngx_stream_lua_udp_socket_metatable_key);
Member

This name change is confusing. We regard ngx_stream_lua_socket_ as something like a namespace.

Author

it matches tcp.
I can change both :)

Author

Not sure Test::Nginx has proper support for udp.
Opened a new issue under test-nginx.

Member

@alonbg Yeah, we need to add UDP support to that test class :) Patches welcome! :)

@alonbg (Author) commented Aug 2, 2016

@agentzh
Rebased.
Travis fails on the udp tests, which require Test::Nginx::Socket::Lua::Dgram, pending a pull request approval :)

if (c->type != SOCK_DGRAM) {
return luaL_error(L, "socket api does not match connection transport",
lua_gettop(L));
}
Member

Indentation problem here?

@alonbg (Author) commented Aug 7, 2016

All review comments addressed and committed; rebased.
Tests fail with "Can't locate Test/Nginx/Socket/Lua/Dgram.pm".
Please see the test-nginx pull request for that.

@alonbg (Author) commented Aug 11, 2016

@agentzh tests still fail. Please advise :)

t/137-udp-count.t
#   Failed test 'TEST 2: entries under the metatable of req sockets - status code ok'
#   at /home/alon/perl5/lib/perl5/Test/Nginx/Socket.pm line 895.
#          got: ''
#     expected: '200'

#   Failed test 'TEST 2: entries under the metatable of req sockets - response_body - response is expected (repeated req 0, req 0)'
#   at /home/alon/perl5/lib/perl5/Test/Nginx/Socket.pm line 1341.
#          got: ''
#     expected: 'n = 4
# '


138-req-udp-socket.t:
TEST 1: sanity - Can't connect to 127.0.0.1:1984: Connection refused
    Retry connecting after 0.675 sec
TEST 1: sanity - Can't connect to 127.0.0.1:1984: Connection refused
    Retry connecting after 0.825 sec
TEST 1: sanity - Can't connect to 127.0.0.1:1984: Connection refused
    Retry connecting after 0.99 sec

.
.

@agentzh (Member) commented Aug 11, 2016

@alonbg The "Connection refused" error reported by the test scaffold is a clear indicator of nginx server crashes (segmentation faults). In this case, we should run that offending test block individually with the valgrind test mode to pinpoint the memory issue in the code base.

@agentzh (Member) commented Aug 12, 2016

@alonbg I'm also getting valgrind memory errors while running your tests on my side. For example,

TEST 1: sanity
==106850== Jump to the invalid address stated on the next line
==106850==    at 0x0: ???
==106850==    by 0x41FAF8: ngx_chain_writer (ngx_output_chain.c:741)
==106850==    by 0x4FD59E: ngx_stream_lua_send_chain_link (ngx_stream_lua_output.c:716)
==106850==    by 0x4FDA53: ngx_stream_lua_ngx_echo (ngx_stream_lua_output.c:227)
==106850==    by 0x4FDB25: ngx_stream_lua_ngx_say (ngx_stream_lua_output.c:36)
==106850==    by 0x5465177: lj_BC_FUNCC (in /opt/luajit21sysm/lib/libluajit-5.1.so.2.1.0)
==106850==    by 0x4EF88C: ngx_stream_lua_run_thread (ngx_stream_lua_util.c:1220)
==106850==    by 0x4EDD47: ngx_stream_lua_content_by_chunk (ngx_stream_lua_contentby.c:167)
==106850==    by 0x4EDEC1: ngx_stream_lua_content_handler_inline (ngx_stream_lua_contentby.c:74)
==106850==    by 0x4EE0CE: ngx_stream_lua_content_handler (ngx_stream_lua_contentby.c:236)
==106850==    by 0x49776A: ngx_stream_init_session (ngx_stream_handler.c:244)
==106850==    by 0x497D94: ngx_stream_init_connection (ngx_stream_handler.c:223)
==106850==    by 0x43B1B1: ngx_event_recvmsg (ngx_event_accept.c:625)
==106850==    by 0x443502: ngx_epoll_process_events (ngx_epoll_module.c:900)
==106850==    by 0x439C1F: ngx_process_events_and_timers (ngx_event.c:242)
==106850==    by 0x442558: ngx_single_process_cycle (ngx_process_cycle.c:309)
==106850==    by 0x41BA20: main (nginx.c:364)
==106850==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==106850==

@agentzh (Member) commented Aug 12, 2016

@alonbg BTW, please rebase to the latest git master for some Travis CI fixes (due to recent Travis CI server environment changes, alas).

@alonbg (Author) commented Aug 14, 2016

@agentzh
ngx_chain_xxx are used from ngx_stream_lua_ngx_echo (ngx_stream_lua_output.c:227) downwards. This goes for ngx.say(), ngx.print() and ngx.flush().
Can we still use the chain memory management API for udp?
How do you suggest we handle this?
Thanks!

@agentzh (Member) commented Aug 14, 2016

@alonbg I think you encountered this while working on ngx_stream_echo_module as well. You cannot use ngx_chain_writer for UDP sends at all. You ended up calling ngx_udp_send in ngx_stream_echo, remember?

@alonbg (Author) commented Aug 14, 2016

@agentzh yes, I remember that :)
Just hoped for an easier way than refactoring this area.
BTW, I do need to allocate there. What API should I use?
Thanks

@agentzh (Member) commented Aug 15, 2016

@alonbg What API should you use? Sorry, I don't know what kind of refactoring you are going to do. I only have the time to comment on your code changes atm. Please see the existing code for a better idea of how to use the nginx C API (and also those nginx programming tutorials out there on the web).

@agentzh (Member) commented Aug 15, 2016

@alonbg I suggest you skip those ngx.print, ngx.say API functions for the UDP downstream since they make very little sense for this context.

@alonbg (Author) commented Aug 16, 2016

Thanks @agentzh. Going through those tutorials as we speak :). As you suggested, I'll probably skip these functions. If so, I'll make sure calls to them in udp context are handled with an error.

@agentzh (Member) commented Aug 16, 2016

@alonbg Yeah, throwing a Lua exception from those unsupported Lua APIs is important.
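
For illustration only, here is a minimal Lua-side sketch of what that would look like to a handler: the unsupported output API raises a Lua error that can be caught with pcall, and the request UDP socket's send() is used instead. It assumes ngx.say raises an error in UDP context as discussed above; this is not settled behavior.

-- sketch of a content_by_lua_block body; assumes ngx.say raises a Lua error
-- in UDP context as discussed above (not settled behavior)
local ok, err = pcall(ngx.say, "hello")
if not ok then
    ngx.log(ngx.ERR, "ngx.say is unsupported for the UDP downstream: ", err)

    -- fall back to the request socket's send(), as in the test case below
    local sock = assert(ngx.req.udp_socket())
    assert(sock:send("hello"))
end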

@agentzh (Member) commented Aug 16, 2016

@alonbg Please rebase to the latest master which contains the cjson travis ci fix.

        end

        for i = 1, 3 do
            local data, err, part = sock:receive(5)
Member

The receive() method should not take any argument. Datagrams are atomic, unlike streams.
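
For reference, a small Lua sketch of the argument-free, datagram-style receive this comment describes, in contrast to the stream-style sock:receive(5) quoted above. ngx.req.udp_socket() is the request-socket API this PR introduces; its exact return values here are assumed from the test case below.

-- datagram-style receive: each call returns one whole datagram, so no size
-- argument is needed (unlike the stream-style sock:receive(5) above)
local sock = assert(ngx.req.udp_socket())

local data, err = sock:receive()
if not data then
    ngx.log(ngx.ERR, "failed to receive a datagram: ", err)
    return ngx.exit(ngx.ERROR)
end

ngx.log(ngx.INFO, "got a ", #data, "-byte datagram")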

@agentzh (Member) commented Aug 16, 2016

@alonbg I've just made a dirty and quick patch against your branch that makes the simplest test case pass completely on my side. Just for your reference:

diff --git a/src/ngx_stream_lua_socket_udp.c b/src/ngx_stream_lua_socket_udp.c
index d066685..366f7a4 100644
--- a/src/ngx_stream_lua_socket_udp.c
+++ b/src/ngx_stream_lua_socket_udp.c
@@ -1058,7 +1058,7 @@ ngx_stream_lua_socket_udp_send(lua_State *L)

     dd("sending query %.*s", (int) query.len, query.data);

-    n = ngx_send(u->udp_connection.connection, query.data, query.len);
+    n = ngx_udp_send(u->udp_connection.connection, query.data, query.len);

     dd("ngx_send returns %d (query len %d)", (int) n, (int) query.len);

@@ -1087,6 +1087,7 @@ ngx_stream_lua_socket_udp_receive(lua_State *L)
 {
     ngx_stream_session_t                  *s;
     ngx_stream_lua_socket_udp_upstream_t  *u;
+    ngx_connection_t                      *c;
     ngx_int_t                              rc;
     ngx_stream_lua_ctx_t                  *ctx;
     ngx_stream_lua_co_ctx_t               *coctx;
@@ -1156,7 +1157,19 @@ ngx_stream_lua_socket_udp_receive(lua_State *L)
                    "stream lua udp socket receive buffer size: %uz",
                    u->recv_buf_size);

-    rc = ngx_stream_lua_socket_udp_read(s, u);
+    c = u->udp_connection.connection;
+
+    if (u->raw_downstream && !u->connected) {
+        u->received = c->buffer->last - c->buffer->pos;
+        c->buffer->pos =
+            ngx_copy(ngx_stream_lua_socket_udp_buffer, c->buffer->pos, u->received);
+        ngx_stream_lua_socket_udp_handle_success(s, u);
+        u->connected = 1;
+        rc = NGX_OK;
+
+    } else {
+        rc = ngx_stream_lua_socket_udp_read(s, u);
+    }

     if (rc == NGX_ERROR) {
         dd("read failed: %d", (int) u->ft_type);
@@ -1268,7 +1281,7 @@ ngx_stream_lua_socket_udp_finalize(ngx_stream_session_t *s,
         u->resolved->ctx = NULL;
     }

-    if (u->udp_connection.connection) {
+    if (u->udp_connection.connection && !u->raw_downstream) {
         ngx_log_debug0(NGX_LOG_DEBUG_STREAM, s->connection->log, 0,
                        "stream lua close socket connection");

diff --git a/src/ngx_stream_lua_socket_udp.h b/src/ngx_stream_lua_socket_udp.h
index b147b54..e7ca378 100644
--- a/src/ngx_stream_lua_socket_udp.h
+++ b/src/ngx_stream_lua_socket_udp.h
@@ -53,7 +53,8 @@ struct ngx_stream_lua_socket_udp_upstream_s {

     ngx_stream_lua_co_ctx_t           *co_ctx;
     unsigned                           raw_downstream:1;
-    unsigned                           waiting; /* :1 */
+    unsigned                           waiting:1;
+    unsigned                           connected:1;
 };


diff --git a/t/138-req-udp-socket.t b/t/138-req-udp-socket.t
index f143c6e..f2b70fb 100644
--- a/t/138-req-udp-socket.t
+++ b/t/138-req-udp-socket.t
@@ -20,31 +20,33 @@ run_tests();
 __DATA__

 === TEST 1: sanity
+--- ONLY
 --- dgram_server_config
     content_by_lua_block {
         local sock, err = ngx.req.udp_socket()
-        if sock then
-            ngx.say("got the request socket")
-        else
-            ngx.say("failed to get the request socket: ", err)
+        if not sock then
+            ngx.log(ngx.ERR, "failed to get the request socket: ", err)
+            return ngx.exit(ngx.ERROR)
         end

-        for i = 1, 3 do
-            local data, err, part = sock:receive(5)
-            if data then
-                ngx.say("received: ", data)
-            else
-                ngx.say("failed to receive: ", err, " [", part, "]")
-            end
+        local data, err = sock:receive()
+        if not data then
+            ngx.log(ngx.ERR, "failed to receive: ", err)
+            return ngx.exit(ngx.ERROR)
+        end
+
+        print("data: ", data)
+
+        local ok, err = sock:send("received: " .. data)
+        if not ok then
+            ngx.log(ngx.ERR, "failed to send: ", err)
+            return ngx.exit(ngx.ERROR)
         end
     }
 --- dgram_request chomp
 hello world! my
---- dgram_response
-got the request socket
-received: hello
-received:  worl
-received: d! my
+--- dgram_response chomp
+received: hello world! my
 --- no_error_log
 [error]

@splitice

What is the status of this PR? Is there any assistance needed?

I came across a need for this capability in one of my projects.

@agentzh (Member) commented Aug 17, 2016

@splitice With my dirty patch above, it is good enough for my DNS name server project already. Though it's not good enough for merging into master :)

@splitice

Cool,

And FYI - my project is to implement a caching DNS proxy. Seems like I'm seeing a pattern among everyone who wants this patch.

@agentzh (Member) commented Aug 18, 2016

FYI: https://twitter.com/agentzh/status/766062354928955392

For the best result there, I had to enable the trace stitching feature of LuaJIT 2.1 (via jit.opt.start("minstitch=3")) and also to disable the per-request global environment table creation in ngx_stream_lua_module :) I think we can achieve a similar effect by recycling the global environment tables, to avoid creating and GC'ing many such (tiny) tables, in the formal implementation of both ngx_stream_lua_module and ngx_http_lua_module ;)
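
For reference, a minimal sketch of the LuaJIT tuning mentioned above; placing it in an init_by_lua_block (or at the top of the entry script) is an assumption here, and the per-request global environment change is a separate module-level patch not shown.

-- hypothetical init_by_lua_block body: raise LuaJIT 2.1's minstitch parameter
-- as quoted above, before any hot code paths are compiled
require("jit.opt").start("minstitch=3")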

@agentzh (Member) commented Aug 18, 2016

Anyway, I'm gonna deploy my pure Lua DNS nameserver atop this NGINX module to power my own openresty.org domain very soon :)

@splitice commented Aug 18, 2016

@agentzh Damn, that's fast. I'm hoping to get at least that. My ideal target is ~200-400K r/s with 4-8 workers (assuming a high cache hit rate and some fairly beefy CPUs).

Any chance you plan to opensource your server? It would be very interesting to see how you have architected it vs what I have done so far.

My needs dictate a proxy rather than a server per se, but if it's open-sourced there's no reason not to extend / reuse it if possible.

@agentzh (Member) commented Aug 18, 2016

@splitice Just created a gist containing everything needed to run my Lua DNS server atop NGINX yourself:

https://gist.github.com/agentzh/6c50d37510daef792ed220fa0d970393

It's very hacky though ;)

There is still room for optimizations. See the C-land on-CPU flamegraph while my Lua DNS server atop NGINX is loaded by dnsperf:

http://agentzh.org/misc/flamegraph/lua-dns-server-on-nginx-2016-08-17.svg

Basically, a pure C DNS parser and generator that we call via FFI should be faster. Also, we could recycle the Lua thread object for each coroutine, reducing GC overhead.
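
Purely as an illustrative sketch of that FFI direction (the shared library name and both prototypes below are hypothetical stand-ins, not an existing API):

local ffi = require "ffi"

ffi.cdef[[
int dns_parse_query(const char *buf, size_t len);        /* hypothetical */
int dns_build_answer(char *out, size_t maxlen, int qid); /* hypothetical */
]]

local dns = ffi.load("mydns")  -- hypothetical C DNS parser/generator library

-- handle one datagram: parse it in C, build the reply into a C buffer, and
-- return the reply as a Lua string ready for sock:send()
local function handle_datagram(pkt)
    local qid = dns.dns_parse_query(pkt, #pkt)
    if qid < 0 then
        return nil, "bad query"
    end

    local buf = ffi.new("char[512]")
    local n = dns.dns_build_answer(buf, 512, qid)
    return ffi.string(buf, n)
end

-- then, in the content handler: sock:send(assert(handle_datagram(data)))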

Have fun :)

@alonbg (Author) commented Aug 18, 2016

@agentzh absolutely fabulous
@splitice yes, help is needed making it master-merge ready :)
Like you, I also need a [udp] proxy rather than a terminating server, and I also need to manipulate both upstream and downstream.
I also need nginx's proxy_bind transparent feature, introduced in nginx 1.11.2.
As a result I wouldn't be able to use content_by_lua_block.

For the upstream manipulation I thought of making this access_by_lua_block feature work with nginx 1.11.2.

And for the downstream I have no direction yet.
It should be something like filter_datagram_by_lua_block with an optional parameter for number of datagrams to be captured by the filter. Don't know if that's feasible or not.

An alternative to the above would be to implement the proxy_bind feature on the upstream cosockets, thus avoiding the use of the nginx proxy, which in turn would allow the entire downstream/upstream manipulation to occur in content_by_lua_block context.

Any kind of help you can offer would be great

@agentzh (Member) commented Aug 18, 2016

@alonbg You may find this PR for ngx_http_lua_module interesting: openresty/lua-nginx-module#712. It adds the bind() method to cosockets.
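
As a rough usage sketch only; the exact bind() signature and semantics belong to that PR and are an assumption here, as are the addresses used:

local sock = ngx.socket.tcp()

-- bind the upstream cosocket to a specific local (source) address before
-- connecting, e.g. for policy routing or transparent proxying setups
local ok, err = sock:bind("192.168.1.10")
if not ok then
    ngx.log(ngx.ERR, "failed to bind: ", err)
    return
end

ok, err = sock:connect("upstream.example.com", 53)
if not ok then
    ngx.log(ngx.ERR, "failed to connect: ", err)
    return
end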

@splitice commented Aug 19, 2016

@alonbg I don't think filter_datagram_by_lua_block would be simple to implement; my understanding is that the stream module is structured differently from the http module.

I'm working under the assumption that (unfortunately) the full manipulation has to occur in the content_by_lua_block.

@agentzh thanks for that, I'll read over it over the weekend.

@alonbg (Author) commented Aug 19, 2016

@agentzh I'll look into the bind PR in ngx_http_lua_module. Thank you.
Though, I do need the IP_TRANSPARENT socket option for non-local addresses.
I'll try adding that as well.

@splitice That's a very important confirmation. Thank you for sharing your thoughts :). I believe I'll continue working under those same assumptions.

@alonbg (Author) commented Aug 22, 2016

@agentzh adopted your dirty patch :)
error handling for unsupported APIs in udp context
passed all udp test cases
rebased

The tests should pass with my test-nginx bugfix

        ngx_stream_lua_socket_udp_handle_success(s, u);
        u->connected = 1;
        rc = NGX_OK;
    } else {
Member

Style: need a blank line before } else {. Please fix other places as well. Thank you.

@alonbg (Author) commented Sep 2, 2016

@agentzh this awaits your approval; is there anything else to do on this pull request?
The UDP tests fail as they require the latest test-nginx pull request.

@bjne commented Sep 8, 2016

@agentzh Can this be useful for your dns impl:
https://github.com/vavrusa/ljdns

@alonbg mentioned this pull request on Sep 8, 2016
@alonbg (Author) commented Sep 8, 2016

continues in a new pull request
