
Conversation

@quge009 (Collaborator) commented Oct 27, 2025

This PR is mainly about improving the user experience.

  • Changes made to optimize users' perceived latency and reading experience (see the streaming sketch after this list):
    • Implement streaming output in the LLMSession class and switch the final-answer generation call to streaming, so the answer is posted to the user as soon as the first few tokens are ready.
    • Implement the push_frontend method, which uses the streaming output to feed CoPilot progress status messages back to the user in real time, managing expectations while they wait for the answer.
    • Add an auto-scroll feature to the frontend plugin to enhance readability.
  • Changes made to reduce the average response latency (defined as the time between receiving a question and posting the answer):
    • Refactor several components (SmartHelp, LTP, ...) into classes, so that state can be preserved when necessary.
    • Reuse the same llm_session instance for requests within the same conversation, avoiding unnecessary HTTPS re-connections during initialization.
    • Implement a new question-parsing function that combines the contextualization and classification LLM calls into a single call (see the sketch after the Effectiveness section).
    • Move prompt reading to instance initialization, to avoid repeated file I/O.
  • A minor bug fix is also included:
    • Correct the assignment of 'turnId' sent to the frontend.
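
For orientation, a minimal sketch of the streaming flow described above. LLMSession and push_frontend are names from this PR, but set_instance_stream_callback, generate_streaming, and the token source are assumptions inferred from the diff (clear_instance_stream_callback appears in a hunk below), not the actual API:

from typing import Callable, Iterator, Optional

class LLMSession:
    """Sketch only: the real class also manages the HTTPS connection
    that this PR reuses across requests within a conversation."""

    def __init__(self) -> None:
        self._stream_callback: Optional[Callable[[str], None]] = None

    def set_instance_stream_callback(self, cb: Callable[[str], None]) -> None:
        self._stream_callback = cb

    def clear_instance_stream_callback(self) -> None:
        self._stream_callback = None

    def generate_streaming(self, token_source: Iterator[str]) -> str:
        # Forward each token to the frontend as it arrives instead of
        # waiting for the whole completion, so the user sees the first
        # tokens almost immediately.
        chunks = []
        for token in token_source:
            chunks.append(token)
            if self._stream_callback is not None:
                self._stream_callback(token)
        return ''.join(chunks)

# Usage: the callback plays the role of push_frontend.
session = LLMSession()
session.set_instance_stream_callback(lambda t: print(t, end='', flush=True))
answer = session.generate_streaming(iter(['Hel', 'lo', '!']))
session.clear_instance_stream_callback()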

Effectiveness of this PR:

  • Impact on accuracy
    • No change
  • Impact on response latency
    • ~15% response time reduction on average
    • ~50% response time reduction for extremely simple questions
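
For the combined question-parsing call mentioned above, a minimal sketch of the idea only; llm_session.complete and the JSON field names are hypothetical, and the real prompt lives in the PR's prompt files:

import json

PARSE_PROMPT = (
    'Given the conversation history, rewrite the user question to be '
    'self-contained AND classify it, in one reply. Respond as JSON: '
    '{"contextualized_question": "...", "objective": "..."}'
)

def parse_question(llm_session, history: str, question: str):
    # One round trip instead of two: contextualization and
    # classification come back from a single completion.
    reply = llm_session.complete(  # 'complete' is a hypothetical method name
        f'{PARSE_PROMPT}\n\nHistory:\n{history}\n\nQuestion:\n{question}')
    parsed = json.loads(reply)
    return parsed['contextualized_question'], parsed['objective']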

@quge009 changed the title from "tmp" to "Improve Performance: CoPilot, response latency, user expectation" on Oct 28, 2025
@quge009 changed the title from "Improve Performance: CoPilot, response latency, user expectation" to "Improve Performance: CoPilot: response latency, user expectation" on Oct 28, 2025
@quge009 changed the title from "Improve Performance: CoPilot: response latency, user expectation" to "Improve Performance: CoPilot: users' perceived response latency" on Oct 28, 2025
@quge009 marked this pull request as ready for review on October 28, 2025 20:04
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@quge009 changed the title from "Improve Performance: CoPilot: users' perceived response latency" to "Improve Performance: CoPilot: users experience" on Oct 28, 2025

// Process all complete SSE messages in buffer
let sepIndex;
while ((sepIndex = buffer.indexOf('\n\n')) !== -1) {
Contributor:

I am not sure whether this can cause an infinite loop when buffer.indexOf('\n\n') !== -1; please make sure the loop can always be exited, no matter which branch of the body executes.
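
The loop terminates as long as every iteration consumes the matched separator from the buffer. A minimal sketch of that invariant, written in Python for illustration (the frontend code itself is JavaScript, and the names here are hypothetical):

def drain_sse_buffer(buffer: str):
    """Split off every complete SSE message (terminated by '\\n\\n')."""
    messages = []
    while (sep_index := buffer.find('\n\n')) != -1:
        # Consume the message plus its separator: the buffer strictly
        # shrinks on every iteration, so the loop cannot spin forever.
        messages.append(buffer[:sep_index])
        buffer = buffer[sep_index + 2:]
    return messages, buffer  # the tail waits for the next network chunk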

SUB_FEATURE = 'ltp'

SKIP_LUCIA_CONTROLLER_EXECUTION = True
class LTP:
Contributor:

Just curious, why name this LTP? From the comments below, LtpQueryEngine seems a better name than the project name. 😁

proxy_send_timeout 2m;
}

location ~ ^/copilot/api/stream(.*)$ {
Contributor:

Do we need to remove the original part above?

"""
self.llm_session = llm_session
self.feature_skipped = True
self.ltp_documentation = get_prompt_from(os.path.join(PROMPT_DIR, self.SUB_FEATURE, 'ltp_documentation.txt'))
Contributor:

I checked the file and found that this function may throw an exception. Is that the exception you want in the init function?
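
If the intent is to fail soft instead of letting construction abort, a sketch along these lines could address the concern; the except clause and the fallback are hypothetical choices, and get_prompt_from is stubbed here for self-containment:

import logging
import os

logger = logging.getLogger(__name__)
PROMPT_DIR = '/path/to/prompts'  # hypothetical; the real value comes from the module

def get_prompt_from(path: str) -> str:
    # Stand-in for the PR's helper: read a prompt file, raising if it is missing.
    with open(path, encoding='utf-8') as f:
        return f.read()

class LTP:
    SUB_FEATURE = 'ltp'

    def __init__(self, llm_session) -> None:
        self.llm_session = llm_session
        self.feature_skipped = True
        path = os.path.join(PROMPT_DIR, self.SUB_FEATURE, 'ltp_documentation.txt')
        try:
            self.ltp_documentation = get_prompt_from(path)
        except OSError as e:
            # Fail soft: disable the feature instead of aborting the
            # whole session at construction time.
            logger.error('Failed to load LTP documentation prompt: %s', e)
            self.ltp_documentation = None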

query, end_time_stamp, parallel, param = gen_promql_query(self.SUB_FEATURE, question, self.llm_session)

if not query:
    logger.info(f'No query found in the response, query is {query}')
Contributor:

Just a suggestion, and you can keep the code as-is. 😊 Should this be a warning instead of info?

help_msg['sku'] +
help_msg['workload'])
else:
    self.capability_str = help_msg['feature']
Contributor:

A kind reminder: do both versions f3 and f4 use this one?

except Exception as e:
    logger.error(f"Failed to parse JSON body for stream_operation: {e}")
    return jsonify({"status": "error", "message": "invalid json"}), 400

Contributor:

Remove the extra empty line.

try:
    llm_session.clear_instance_stream_callback()
except Exception:
    logger.debug('Failed to clear instance stream callback')
Contributor:

What will happen if an exception happens here?
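
If the worry is what a failure here could mask, one option is to scope the broad except to the cleanup inside a finally block, so the answer path itself is never affected. A sketch, with generate_answer as a hypothetical stand-in for the surrounding call:

try:
    answer = generate_answer(llm_session, question)  # hypothetical surrounding call
finally:
    try:
        llm_session.clear_instance_stream_callback()
    except Exception:
        # A cleanup failure is non-fatal here: the callback only routes
        # streaming tokens, so record it for debugging and move on.
        logger.debug('Failed to clear instance stream callback')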

# version f3, resolves objective 8 (Lucia Training Platform)
if self._version == 'f3':
    if obj.count('8') > 0:
        # debug only
Contributor:

Should we remove this code that is for debugging only?

    help_keys = ['unsupported_question']
    answer = self.smart_help.generate(question, help_keys, True)
    debug = {}
elif obj.count('8') > 0:
Contributor:

I see many magic numbers like 8, 3, and 9 here; please add comments or replace them with named constants so other developers can understand them.
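
A minimal sketch of that suggestion; only '8' is documented in the hunk above (Lucia Training Platform), so the other names are placeholders:

# Objective labels returned by the question classifier.
OBJECTIVE_LTP = '8'          # Lucia Training Platform (per the comment in the hunk above)
OBJECTIVE_UNSUPPORTED = '3'  # placeholder name, for illustration only
OBJECTIVE_OTHER = '9'        # placeholder name, for illustration only

if obj.count(OBJECTIVE_LTP) > 0:
    ...  # handle LTP questions, as in the hunk above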
